This is the third and final article in a series discussing personal experiences with QA and continuous delivery. The first article is located here; the second is here.
I have a confession to make: In past articles discussing Quality Assurance and continuous delivery models, I implied – more or less – that the products I manage follow a continuous delivery process. While we did implement many aspects of continuous delivery, I’d be stretching the truth if I said we followed such a model fully. Let me explain…
At the time of the last article, the process included automated unit tests and a fully automated deployment pipeline. Specifically, the build server yanked code from source control, executed the automated tests, and then pushed the code to all servers, including the remote disaster recovery site. The process aborted if any of the tests failed, and we even built hooks into GitHub to automatically trigger the build when code was merged into master. That was all great stuff. However, we still developed using one source control branch for DEV and another for QA. Ultimately, this meant features would “sit” in a branch until QA got around to validating them. In reality we were one-deployment-every-few-days continuous, not true continuous.
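For the curious, the logic of that pipeline boils down to something like the sketch below. The server names, script names, and commands are hypothetical placeholders rather than our actual configuration; in practice the build server itself orchestrates these steps.

```python
# Hypothetical sketch of the build-and-deploy pipeline described above.
# Server names, paths, and commands are illustrative, not our actual setup.
import subprocess
import sys

SERVERS = ["app-01.example.com", "app-02.example.com", "dr-site.example.com"]

def run(cmd):
    """Run a shell command and abort the pipeline if it fails."""
    result = subprocess.run(cmd, shell=True)
    if result.returncode != 0:
        print(f"Step failed: {cmd} -- aborting deployment")
        sys.exit(result.returncode)

def main():
    run("git pull origin master")   # yank the latest code from source control
    run("./run_unit_tests.sh")      # any test failure aborts the whole process
    for server in SERVERS:          # push to every server, including the DR site
        run(f"./deploy.sh {server}")

if __name__ == "__main__":
    main()
```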
I’ll readily admit I was the hold-up. The QA team did find defects, and by my thinking those bugs would be pushed directly to production in a true continuous model. I did not want that to happen on my watch. But the passion of my team finally won me over and I was convinced – cajoled, really – to take the final step to true continuous deployment. The change happened May 6th. As I write, we are well past the New Year, and the team has over eight months of continuous experience under its collective belt. How has it been? In a word: awesome. I would never go back to the old model, and I’m certain product stakeholders would agree with me.
Before I discuss the results, let’s take a quick look at the process as it stands today:
- Developers fork the git repository. They work locally, and commit to the fork daily.
- As soon as a single feature is complete, a pull request is initiated. In this context, the word “complete” means:
- All new/changed code is submitted with corresponding unit tests.
- If the product team provided acceptance criteria, then there must be a unit test verifying said criteria.
- All tests must pass.
- The code must meet all internal standards.
- A senior developer reviews the code change via the pull request.
- Note that the review is usually quick because the pending code change (a single feature, or part of one) is typically very small.
- Code standards are enforced at the review point.
- Once review comments are addressed, the senior dev merges the code to master.
- The deployment process uses GitHub WebHooks to trigger the build agent when a merge to master is detected. Once triggered, the CI server executes the build process (a minimal sketch of this trigger appears after this list).
- Server architecture was specifically designed and configured to allow for zero down-time deployments, which means we can (and do) deploy during the day. Multiple times in fact.
- The developer coding the update is responsible for a verification check when the feature makes it to production.
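Conceptually, the WebHook piece is just a listener that reacts to pushes on master. Below is a minimal sketch using Python’s standard library; in our case TeamCity’s own GitHub integration does this work, so the build-trigger call is a stand-in, not a real API.

```python
# Hypothetical sketch of a WebHook listener that kicks off the CI build when a
# merge to master is detected. The trigger function is a placeholder; in practice
# the CI server (e.g., TeamCity) provides its own GitHub integration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def trigger_ci_build():
    # Placeholder: the real CI server exposes its own trigger mechanism.
    print("Merge to master detected -- triggering build")

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # GitHub push events carry the branch name in the "ref" field.
        if payload.get("ref") == "refs/heads/master":
            trigger_ci_build()
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```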
The results? Anecdotally, stakeholders love it. They see features in production much more quickly than ever before. Tech loves it because they don’t have to “sit on” completed features anymore. They push code as soon as it’s done and then move on to the next task. A nice side bonus is a dramatic decrease in the number of gnarly code conflicts we might encounter when merging a few days’ – or even a week’s – worth of development into master right before deployment. And we no longer experience those infrequent but very hard-to-track-down issues where a feature works on QA during testing, but then fails for some mysterious reason when deployed to production.
So far so good! But what about bugs? Surely, skipping the formal QA step must increase the risk of pushing a bug? As mentioned above, this was my point of friction: I knew defects were caught by QA in our previous model. Theoretically, those bugs would be deployed directly to production and consumed by users after we moved to true continuous. I shuddered thinking through a scenario where the VP called to ask how a bug made it to production and my only possible response was, “it was deployed straight to production without QA”.
Two strategies mitigate this risk: feature hiding, and “staged” deployments. I’ll describe both plus the role QA plays within our continuous process. (Hint: it is not exactly what I postulated at the end of “Quality Assurance: Is it Still Relevant?”)
Feature Hiding
This is a method where functionality is guarded by a permission or a switch. When the switch is off, or a user without the permission navigates to the application, the functionality is inaccessible. This allows QA to validate (on production) with no worry that the feature will be consumed by an unintended party. It also allows us to execute a metered roll-out to a subset of the full user population. As a final benefit, if we do happen to find an issue, all that is required to remove the feature instantly is a simple button-click to toggle it off.
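Conceptually, the check behind a feature switch looks something like the sketch below. The user model, switch store, permission names, and roll-out percentages are all illustrative assumptions, not our actual implementation; the key property is that the switch can be flipped without a deployment, which is what makes the one-click rollback possible.

```python
# Hypothetical sketch of feature hiding via a switch plus permissions.
# The User model and switch store are illustrative only.
import hashlib
from dataclasses import dataclass, field

@dataclass
class User:
    id: str
    permissions: set = field(default_factory=set)

FEATURE_SWITCHES = {"new_billing_screen": True}   # toggled off with one admin button-click
ROLLOUT_PERCENT = {"new_billing_screen": 10}      # optional metered roll-out

def feature_enabled(feature: str, user: User) -> bool:
    """A feature is visible only if its switch is on and the user qualifies."""
    if not FEATURE_SWITCHES.get(feature, False):
        return False
    if f"preview:{feature}" in user.permissions:
        return True                               # QA and internal staff see it early
    # Metered roll-out: expose the feature to a stable subset of users.
    bucket = int(hashlib.md5(user.id.encode()).hexdigest(), 16) % 100
    return bucket < ROLLOUT_PERCENT.get(feature, 0)

# Example: QA validates on production while regular users see nothing.
qa_user = User(id="qa-1", permissions={"preview:new_billing_screen"})
print(feature_enabled("new_billing_screen", qa_user))   # True
```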
Staged Feature Roll-outs
This is a concept where we take a large feature set that might take several weeks (or even months) to develop, and we break it up into small, modular chunks that are deployed as soon as they are complete. While the individual chunks are not fully functional, they can perform some task that may be measured, tested, and validated. In some cases, the chunks might be useful enough on their own to be used by customers before the full feature set is complete. In other cases the chunks may sit hidden, used by internal staff and QA only, until all the pieces are deployed and stitched together.
Together, feature hiding and staged roll-outs allow us to deploy to production frequently. Why is this so special? First, since we are deploying often, each deployment is smaller, and smaller changes dramatically reduce complexity. Second, when we do encounter a deployment issue, smaller changes are an order of magnitude easier to isolate and troubleshoot. Gone are those late nights trying to analyze a month’s worth of features (many thousands of lines of code changes) for a bug that slipped through to production the day after a deployment.
Role of QA
In the continuous world, the role of my QA team changed from a focus on manual testing to the creation and maintenance of automated integration tests. These tests run against production every night to verify all is in working order. For larger projects, or for projects with very high impact, they still engage in manual testing, but this time they do it on production via an application switch.
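To make that concrete, here is a minimal sketch of the style of nightly check involved, using Python’s standard unittest module. The base URL, endpoints, and assertions are hypothetical; the real suite exercises actual business workflows against production, not just page pings.

```python
# Hypothetical sketch of a nightly integration check run against production.
# URL and endpoints are illustrative assumptions.
import unittest
import urllib.request

BASE_URL = "https://app.example.com"   # assumed production URL

class NightlyProductionChecks(unittest.TestCase):
    def test_site_is_up(self):
        with urllib.request.urlopen(f"{BASE_URL}/health", timeout=10) as resp:
            self.assertEqual(resp.status, 200)

    def test_login_page_renders(self):
        with urllib.request.urlopen(f"{BASE_URL}/login", timeout=10) as resp:
            self.assertIn("Sign in", resp.read().decode())

if __name__ == "__main__":
    unittest.main()
```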
Does what I’ve written so far sound too good to be true? Are you skeptical that this process can generate better quality code? I’m glad you asked, because, as previously noted, this was a significant concern of mine. So what did I do? I analyzed quality before and after the switch to continuous delivery by tracking bugs. The following chart shows bug history over the past year.
True continuous was implemented in early May. The data suggests:
- True continuous deployment has not increased the total number of bugs deployed to production. It is difficult to tell from the graph, but the total bug count after the switch is actually slightly lower than before.
- Priority bugs appear to be addressed more quickly under a continuous model, which is what one would expect.
What this graph does not show are the other benefits – quicker time to market, increased developer satisfaction, better morale resulting from dramatically fewer deployment conflicts, and so on. If I’ve piqued your interest, please keep in mind that switching to continuous is an enormous cultural and technical change. Astute readers will notice the dates on my articles and guess that the process took us around six months to implement. Here are some of the major hurdles and tasks we faced:
- Research and test time required to find and validate a zero-down-time application server.
- R&D the technical integration between TeamCity, GitHub, and our platform build agent.
- Implement new standards and processes governing unit test creation, code review, and deployment. We rolled this out slowly and were keen to get buy-in from senior development staff. This bullet is one of the “big” ones. It is my opinion that a continuous process cannot be successful without aggressive compliance with unit test creation, code review, and code standards.
- Work with the product team to educate them and get buy-in.
- Work with the QA team to help them learn new skills (coding automated tests) and adapt to the new process.
Yes, it is a ton of work. However, after making the journey I have no reservations saying the effort was well worth it. I don’t think I’d ever willingly return to once per month, or even once per week, releases. Continuous is the way to go.