This is a guest post written by Eyal Keren, CTO at Rollout.io, makers of the Secure Feature Management System for enterprises.
I’m going to ask you a question that’s less philosophical than it sounds. How do you know when an issue is done? And I mean done-done. This tends to be deceptively hard to answer.
Assuming you use issue-tracking software, you might be tempted to say, “It’s done when we mark it done.” But then the follow-up qualifiers start. “It’s done when we mark it done. Unless we decide not to do it and mark it with ‘won’t do’ or we can’t reproduce the issue and we mark it with ‘cannot reproduce.’ Oh, and some issues are duplicates, of course.” You get the idea.
But there’s complexity beyond that, as well.
Let’s think less in terms of status field specifics and more in terms of conceptual workflows. This will eliminate the one-off statuses and allow us to think about what it really means for an issue to be complete.
In most shops, “done” is a purely internal concern
Perhaps it starts with a bug report via email:
Login screen periodically hangs with Firefox on Android when users enter valid permissions and click “Log In.”
Let’s say that our hypothetical development shop is, luckily, a sophisticated one. It knows how to automatically convert emails to Jira issues for prioritization and routing to the proper person. And so the issue’s lifecycle begins.
From there, our reported defect makes its way through an initial support team, which then escalates it to engineering. Here, a project manager assigns it to a software developer, who picks it up during the next development cycle. The developer tries to reproduce the problem, and…
Oh, sure enough!
There’s a subtle bug in the server-side code handling the POST that would only manifest itself under these conditions. The developer writes a test that exposes the defect, fixes it, and then promotes the fix into CI for eventual deployment. From a workflow perspective, the issue becomes “ready for test.”
Now, let’s say that this sophisticated shop has also incorporated test suite management. This allows strategic automation to flag the issue as verified in some capacity, based either on the results of the developer’s unit test or the QA folks running their suite of tests. And from there, perhaps there’s a final sign-off from a product manager to tie a nice bow on the issue.
Surely, now it’s both done and done-done. Right?
Is an issue ever really “done”?
You’ve recorded, reproduced, fixed, and verified the issue, so now your organization declares it done. Yay!
Now what? Well, now, of course, you hurl it over the wall into production with your next release. And once it’s there, you hunker down and wait for issue reports of all kinds, including our little login issue. It’s pins and needles for a day or two, but then you start to relax. Problem reports don’t flood in. Support personnel and operations aren’t overwhelmed. Things seem okay.
Now, as long as nobody calls you up to yell about login problems, you feel good about calling this issue done.
But wait a minute. If, at any given moment, there’s a chance that some user will engage in issue necromancy and bring it back from the dead, is it really done? Or is it in some kind of Schrödinger’s Box of done-ness? Except it kind of feels like you’re the one in the box since, at any given moment, an entirely opaque process out there might ruin your day.
A brief history of issue tracking
Based on what I’m saying here, you might think I have a low opinion of issue management. Quite the contrary. I may be dating myself a bit with this, but I remember my earliest days of issue management involving a brand-spanking new tool called Bugzilla.
Well, if I’m being honest, my very first issue tracking tool was a Notepad file with an issue per line. Then, when that proved insufficient, I remember increasingly complex spreadsheets with weird little bits of VBA automation. So when we first started using Bugzilla, I was ecstatic to have such a wonderful tool that relieved us of the maintenance burden and brought sanity to issue management.
In those early days, these were relatively simple database applications. And they were somewhat few and far between, especially at the entry level of the market. That would, of course, change. The early 2000s saw an explosion in options and mounting sophistication.
Over the last couple of decades, these tools have made amazing strides in sophistication. They’ve, by and large, migrated to the web and then the cloud, supporting multi-tenant paradigms. Today, you’ll find all sorts of integration hooks, APIs, and entire marketplaces dedicated to extensibility. And of course, they have incredibly slick and polished GUIs compared to what I started out with. They’d have to, for the story I just told to make sense. How else could an issue tracking tool support things like converting emails to issues or incorporating test run results into the workflow?
And yet, I’ve never seen one offer any kind of meaningful window into production—at least not until now.
A revolution in issue management: window into production
Atlassian recently announced that Jira will include feature flag integration. If you’re not familiar with feature flags (aka “feature toggles”), the idea is that they allow you to configure which users see what in your application without changing or deploying your code at all. (If you want a longer feature flag explanation, you can read in detail here.)
To understand the power of feature flags, let’s revisit the buggy login from my example. Let’s say that our software developer makes the change in question, and let’s further assume that, for whatever reason, that’s the only change in what will become a production patch. But now, let’s say that we’re nervous about disrupting any existing users with the patch to our web app.
So we introduce the fix with a feature flag that we must specifically toggle to “on” before anyone actually sees the fix. In other words, we deploy a patch whose only difference is that it contains code that says, “Unless we turn on this flag, do the same thing you’ve always done. Otherwise, try the code containing the fix.”
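To make that “unless we turn on this flag” idea concrete, here’s a minimal sketch in Python. Everything in it is made up for illustration: the flag name `login_hang_fix`, the two handler functions, and the plain dictionary standing in for a flag service. A real feature flag system would fetch the flag state remotely rather than from a dict.

```python
# Hypothetical sketch: a fix shipped behind a feature flag.
# The flag name and handlers are illustrative, not from any real product.

def handle_login_legacy(form: dict) -> str:
    """The code path that has always run (and contains the subtle bug)."""
    return "legacy"


def handle_login_fixed(form: dict) -> str:
    """The new code path containing the fix."""
    return "fixed"


def handle_login(form: dict, flags: dict) -> str:
    # Unless the flag is explicitly on, do the same thing we've always done.
    if flags.get("login_hang_fix", False):
        return handle_login_fixed(form)
    return handle_login_legacy(form)
```

With the flag missing or off, `handle_login` behaves exactly like the old handler, which is why deploying it changes nothing for users.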
You push this code into production with the flag off and, of course, no new problems occur. Why would they? It’s the same code.
Then, you turn the fix on for a handful of users—specifically only those using Firefox on Android to access the site. If that solves the issue and creates no problems, you turn it on for all Firefox users. Then all Android users. And then everyone, making sure at every step that the fix causes no new issues.
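That staged rollout can be sketched as a list of targeting rules, each one wider than the last. This is only an illustration under my own assumptions: the `User` fields, the stage numbering, and the choice to keep Firefox users enabled once we widen to Android are all hypothetical, and a real flag service would express the rules in its own configuration.

```python
# Hypothetical sketch: widening the audience for a flag one stage at a time.
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    browser: str  # e.g. "firefox", "chrome"
    os: str       # e.g. "android", "windows"


# Each stage is a rule deciding whether a given user sees the fix.
# Stages are cumulative: anyone enabled at one stage stays enabled later.
STAGES = [
    lambda u: False,                                          # 0: flag off
    lambda u: u.browser == "firefox" and u.os == "android",   # 1: Firefox on Android
    lambda u: u.browser == "firefox",                         # 2: all Firefox users
    lambda u: u.browser == "firefox" or u.os == "android",    # 3: plus all Android users
    lambda u: True,                                           # 4: everyone
]


def fix_enabled(user: User, stage: int) -> bool:
    """Return True if this user should get the fix at the given stage."""
    return STAGES[stage](user)
```

Moving from one stage to the next is a configuration change, not a deployment, so you can stop or step back the moment a stage causes trouble.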
This is the power of feature flags. And now, you can monitor how all of that is going right from within your original Jira issue, thanks to an integration between Jira and Rollout.io.
Continuous visibility is essential for the modern enterprise
Initially, this might seem more like a cool curiosity than a game-changer, but let the implications wash over you a bit. Today’s standard approach to issue tracking is that, after QA, we label the issue “done,” cross our fingers, and hope that nobody calls us screaming about not being able to log in.
But now we can do something else entirely.
We can hold off calling the issue done until it passes a whole series of production checkpoints: rolled out to a small group, then rolled out to a larger group, then rolled out to everyone. We can monitor it every step of the way, seeing, right there in the issue, whether the new code is running and how many people are using it. And then, once everyone has been using it for a while, we can call it done.
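One way to picture that rule is as a small gate that refuses to call an issue done until every checkpoint has been passed. The sketch below is purely illustrative and is not how Jira or Rollout.io model it: the `Checkpoint` fields, the checkpoint names, and the seven-day soak period are all assumptions of mine.

```python
# Hypothetical sketch: "done" means every production checkpoint has passed.
from dataclasses import dataclass

# Assumed checkpoint order, taken from the rollout steps described above.
CHECKPOINTS = ("small group", "larger group", "everyone")


@dataclass
class Checkpoint:
    name: str
    healthy: bool       # no new problems observed at this step
    days_observed: int  # how long the fix has run at this step


def is_done(passed: list, soak_days: int = 7) -> bool:
    """An issue is done only after all checkpoints pass, in order,
    and the full rollout has run cleanly for `soak_days` (an assumed value)."""
    if tuple(c.name for c in passed) != CHECKPOINTS:
        return False  # a checkpoint is missing or out of order
    if not all(c.healthy for c in passed):
        return False  # something went wrong along the way
    return passed[-1].days_observed >= soak_days
```

The soak period is the “once everyone has been using it for a while” part; pick whatever length matches how quickly your users tend to surface problems.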
Now, this won’t completely eliminate all uncertainty or completely solve the Schrödinger’s Box/Issue problem, but it gets us an awful lot closer… and with a lot more intelligence.
Modern companies have embraced movements like agile, lean, and DevOps in order to improve product and service delivery. This involves eliminating waste, building cross-functional teams, and knocking down organizational silos. And continuous visibility inside of issue tracking is the next logical step in this progression.
This integration of feature flagging and issue tracking makes product and production information highly visible to everyone within the IT organization, and all in one place. No longer do you have wireframes for the product people, dashboards for the operations people, and defect tracking for the developers. Instead, you have one issue that matters to everyone, provides intelligence to everyone and gets everyone on the same page for important decision-making.