DevOps: not just for web applications

Riding the wave of DevOps and continuous delivery. “Sun Curl” by ClarkLittlePhotography.com

Two years ago I didn’t think DevOps applied to my company. I was aware of the buzzword, but a cursory look at a few articles suggested that DevOps was all about web applications and web services. My company primarily develops firmware and enterprise software. In our world, production is owned by our customers, or even our customers’ customers. We don’t deploy to a production environment that we own and operate – we just post new installers. As for frequent releases, our product managers weren’t asking for them.

Dawning awareness

I didn’t seriously consider DevOps at my company until a colleague, one of the other product development directors, suggested that we consider building a DevOps team (yes, anti-pattern alert) and that my department – Software Quality – might be a good home for this new team.

Software Quality already had a central team building and maintaining an automated system test framework, along with common test environments for manual testing and test automation. A number of the product development builds were also owned and maintained by people in Software Quality. On top of that, we were driving some of our product dogfood programs, which meant working closely with our corporate IT department to get people across the company using pre-release versions of our software in their day-to-day jobs, and to troubleshoot issues. In a way, parts of Software Quality were already acting a bit like an internally facing Ops group combined with QA, and we were embedded in scrum teams working alongside developers. So the new buzzword, which advertises itself as the intersection of development, QA, and operations, seemed like a possible fit.

We started out with a series of meetings to talk about a possible DevOps initiative and what it might look like at our company. The head of corporate IT, the two software development directors, and the manager of the group building out our test framework and test infrastructure all took part. We reviewed the DevOps Wikipedia article, which is quite brief but gets the general idea across. The concepts sounded applicable even to an organization producing enterprise applications. Our challenges are different from the ones you read about in many DevOps blog articles, but it seemed like the core DevOps concepts applied just as much to us as to any other company.

We all agreed to a rough plan: each of us would start by contributing one of our people to form a new DevOps team. The idea was that our people would maintain their formal reporting lines, but “dotted line” to the DevOps team. The DevOps team’s work would be managed, using a kanban-style system, by the Software Quality manager who was already managing the team looking after our test framework and test infrastructure. The team would be responsible for the shared development and test infrastructure and the various shared tools we relied on for development and testing. They would also use tools like Chef and OpenStack to develop automation that made internal deployments easier – deployments of virtual machines, of our software and firmware, and of the third-party software these depend on.

What happened next

Of course things don’t always go exactly as planned. IT was short-staffed and could only dedicate half of one person’s time to the new team. Also, only one of the software development directors was able to spare a person, and that was done by transferring a headcount for a new hire instead of maintaining line reporting back to the development organization. We expected these to be fairly temporary concessions as new budget years approached, but of course priorities are always shifting and things often last longer than you expect.

Before we had the DevOps team, each scrum team, or small group of scrum teams, owned its own virtual machines, servers, physical machines, and devices for development and testing. The team creating the common automated test framework was managing a limited pool of common resources, but welcomed the chance to pass that infrastructure work over to the new team. With a centralized team in place that had a background in corporate IT operations, we volunteered to take ownership of all of the team development and test environments. Feature teams could then focus on product development and testing, and infrastructure could be shared between teams, who could reserve what they needed using a custom inventory system that was also used for test automation reservations.
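The core of a reservation scheme like the one our inventory system provides can be sketched simply. This is not our actual system – the names and the date-range model are illustrative assumptions – but it shows the essential rule: a shared asset can be booked for a window of time only if the window doesn’t overlap an existing reservation.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Reservation:
    team: str
    start: date
    end: date  # inclusive

@dataclass
class Asset:
    """One shared piece of infrastructure (server, device, VM host, ...)."""
    name: str
    reservations: list = field(default_factory=list)

    def is_free(self, start: date, end: date) -> bool:
        # Two inclusive date ranges overlap unless one ends before the other starts.
        return all(r.end < start or r.start > end for r in self.reservations)

    def reserve(self, team: str, start: date, end: date) -> bool:
        """Book the asset if the window is free; report whether it succeeded."""
        if not self.is_free(start, end):
            return False
        self.reservations.append(Reservation(team, start, end))
        return True

# Usage: two scrum teams competing for the same shared test server.
server = Asset("test-server-01")
assert server.reserve("team-a", date(2014, 6, 2), date(2014, 6, 13))
assert not server.reserve("team-b", date(2014, 6, 9), date(2014, 6, 20))  # overlaps team-a
assert server.reserve("team-b", date(2014, 6, 16), date(2014, 6, 27))
```

A real system layers persistence, authentication, and the test-automation integration on top, but the conflict check is the part that lets teams self-serve without stepping on each other.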

With our new DevOps team smaller than planned, we found that while teams appreciated not having to purchase and manage their own infrastructure, sometimes they had to wait longer than they would have liked for new setups and for changes to be made. Worse, with so many urgent requests coming in from the teams – often with little notice – our DevOps team was spending almost all of their time on tickets. Our automation plans suffered. We managed to make some progress on automated configuration with Chef and made some inroads into deploy-on-demand with OpenStack, but far less than we would have liked.
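The idea behind the Chef work mentioned above is convergent configuration: describe the desired state, check the actual state, and change things only when they differ, so the same run is safe to repeat. Here’s that idea reduced to a minimal Python sketch (not Chef itself, and not our recipes – just the convergence pattern applied to a single config file):

```python
import os
import tempfile

def converge_file(path: str, desired: str) -> bool:
    """Bring `path` to the desired contents.

    Returns True when a change was made (analogous to a Chef resource
    reporting itself as 'updated'), False when already converged.
    """
    try:
        with open(path) as f:
            if f.read() == desired:
                return False  # already in desired state: do nothing
    except FileNotFoundError:
        pass  # file missing: fall through and create it
    with open(path, "w") as f:
        f.write(desired)
    return True

# Usage: running the same step twice is harmless (idempotent).
config = "listen_port = 8080\n"
path = os.path.join(tempfile.mkdtemp(), "app.conf")
assert converge_file(path, config) is True    # first run: file created
assert converge_file(path, config) is False   # second run: nothing to do
```

Idempotence is what makes this kind of automation trustworthy enough to run unattended, which is exactly what a ticket-swamped team needs.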

Making room for what matters

The goal had always been to quickly put in place automation that would make it easier to give the teams what they needed and to allow many things that required DevOps assistance to become self-service. We were in a chicken-and-egg situation though. Without the automation in place to take the manual load off DevOps, the important but not urgent work of automation kept getting preempted by important and urgent requests from the teams. We were running as fast as we could but not making much forward progress.

We’re now focusing on ways to enable the feature teams to handle more of their own requests manually, to buy us the time to develop the common automation we need. This will require some training for key individuals from the feature teams, since the way things are done now differs from how they were done in the past: we have new standards around security and permissions, Chef and OpenStack are now part of the picture, and there’s the inventory system to consider too. It isn’t always obvious how to add multiple related components to the inventory system so that they’re correctly grouped together, and the way we model system components often has to change as new products and platforms come into the picture.
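The grouping problem can be made concrete with a small sketch. Again, this is not our inventory system’s real data model – the names and component kinds are hypothetical – but it illustrates the shape of the issue: related components (say, a controller and its attached devices) need to be modeled as one unit so they can be reserved and tracked together.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    kind: str  # e.g. "server", "device" -- kinds here are illustrative

@dataclass
class System:
    """Groups related components so they're reserved and tracked as one unit."""
    name: str
    components: list = field(default_factory=list)

    def add(self, component: Component) -> None:
        self.components.append(component)

    def kinds(self) -> set:
        """The distinct component kinds in this system."""
        return {c.kind for c in self.components}

# Usage: one test rig made of a controller plus two attached devices.
rig = System("test-rig-7")
rig.add(Component("ctrl-7", "server"))
rig.add(Component("dev-7a", "device"))
rig.add(Component("dev-7b", "device"))
assert rig.kinds() == {"server", "device"}
```

The hard part in practice isn’t the data structure – it’s deciding where the system boundaries go, which is exactly what keeps changing as new products and platforms arrive.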

Inspiration for more change

I recently read a great article and interview with Gary Gruver (LinkedIn, @GruverGary) about how HP adopted a DevOps culture of extensive test automation and continuous internal delivery for their printer firmware products. I’ve been sharing the article with everyone I can, and I’m now halfway through the excellent book “A Practical Approach to Large-Scale Agile Development: How HP Transformed LaserJet FutureSmart Firmware”, which presents the full story in more detail.

The article was great for many reasons. First, it was evidence that continuous integration and continuous delivery can work for large and complex firmware systems like ours. Second, it provided some amazing numbers (an 8X improvement in the capacity for innovation!) showing that the effort had a huge payoff. Finally, as one of the change leaders at my company, I found it heartening to read that a massive amount of resistance to change is normal – and better yet, that even the biggest detractors can become the biggest fans once they experience the results.

When I talk to people, almost to an engineer, until they have worked in this type of environment they will never believe it can work. Then once they have developed in this type of environment, they can’t imagine how they could ever go back.
— Gary Gruver

The book is providing ideas and guidance on where to focus next in our own transformation. It won’t be quick or easy for us. Like HP, we’re dealing with firmware, and quick and efficient testing will involve building more simulation and emulation software, which requires time and commitment from our development teams. We also have a history of over-reliance on system-level tests and manual tests, with not enough unit and integration tests, which means we have a lot of work to do to get our automated testing up to snuff. Feature teams are already working as fast as they can to build new products and features and to fix bugs. There’s nobody to pass that work on to, so achieving a good level of test and deployment automation will require a commitment to go slower on features in the short term so that we can develop the capability to go faster in the longer term.

Faster external deliveries

Like the HP FutureSmart firmware group, our company releases the software and firmware we develop quarterly, and many people in my organization, as well as key customers, see this as fast enough. There’s also a significant and vocal faction, both inside and outside the organization, who view quarterly releases as too fast. As evidence, they offer that many of our customers don’t want to upgrade so frequently, that many of our features take more than three months to implement anyway, and that many of our quarterly releases don’t contain a lot of visible added value in terms of innovative new capabilities. They mention that there’s a lot of overhead involved in going from development and testing complete to generally available. They point out that, unlike many companies who deploy frequently to production, we don’t own production or even sell directly to end customers. Instead, our products are used as a platform by our direct customers, who integrate them with their own customized value-add software. This means that even if we deliver software faster to our customers, there might be long delays before our customers integrate and release that software to their customers.

While the facts that the “slow down” faction presents are valid today, I think we need a mind shift on the business side, similar to the one Gary Gruver describes among the HP developers, to help people see the value of driving not for slower deliveries but for faster ones. Frequent delivery of value to customers is a core agile principle, and new and innovative features aren’t the only form of value. Bug fixes, security improvements, maintainability improvements, and usability improvements provide value too. Fast feedback – not just from internal testing but from real customers – is needed if we want to be able to adjust, adapt, and apply our efforts in the most productive way.

To the internal people and key customers who say we should slow down and deliver less often, I want to point out that most of today’s facts aren’t necessarily fixed. If customers don’t want to upgrade frequently, why is that? What can we do to make their upgrades quick, painless, and low risk? What can we do to help our customers see the value of taking changes in small batches? Like the pre-transformation HP, we aren’t producing a lot of innovative and flashy features quickly, but that should improve as we continue our transformation. In the meantime, the less-visible improvements are still improvements. Why wouldn’t we want our customers, new and existing, to benefit from them as soon as they can? What kind of message do we send to our development teams if we say their software is “not worth the effort to release”? Why can’t we apply the same kinds of agile principles and automation used for internal releases to external releases? Releasing less frequently just gives us a continued excuse to use a less efficient release process.

What do you think?

I’d love to hear from anyone who has been through, or is going through, a similar transformation towards continuous integration, continuous delivery, and DevOps practices. I’d especially love to hear about experiences involving enterprise software and firmware, where resistance to change seems to be stronger than in other types of software development, such as mobile or web applications.

Is all of this worth the significant effort it will take? I clearly believe the answer is a resounding “yes”, but others are not so convinced, and true stories and experiences relayed by people they can relate to are more convincing than abstract theories.

Thanks for reading, and I hope to hear from you soon!
