Treat Defects Like Milk

As professional software developers we apply good technical practices with the goal of delivering high-value products with no defects. When defects do occur, high-performance agile teams deal with them quickly and effectively, preferably as soon as they are exposed. That's not to say that these teams fix all defects—sometimes, for example, the economics don’t favor spending effort to fix a particular defect.

Sadly, in spite of their good intentions, many organizations are swimming in a sea of defects. I won't go into the many causes of defects. In this post, I am more interested in how to manage defects once we become aware of their existence.

Typical Defect Categories

For the sake of discussion, let’s organize defects into three categories: must do now, important, and maybe someday.

Must-do-now defects are so important that when they occur we would be willing to interrupt the current sprint (perhaps even abort the current sprint) in order to fix them. A good example might be a defect that is in the critical production system through which our company generates the majority of its revenue. Delaying work on this defect would be economically foolish, so we work on it immediately. As a result, these defects never live long.

Important defects are just that: important, but not so important that we would interrupt or abort the current sprint to work on them. However, they need to be addressed soon, perhaps in the next sprint. Many teams will add these important defects to their product backlog, ordered relative to the other important items already there. That approach makes sense because it forces a business decision about the defect. For example, in the next sprint we can work on Features G, H, and I, or we can work on Features G and H and fix the important defect.

Maybe-someday defects are non-critical, low-impact defects. The best agile teams address these as they become known. However, I've seen many teams that are not as diligent; before long, the defects start to pile up. What should we do with them?

First, we need to decide where to store them. If there are more than a few, putting them in the product backlog is a problem. Most teams like to keep their product backlog to no more than 150 items, so adding a large number of lower-priority defects would quickly push the backlog well past that limit. So perhaps these maybe-someday defects are stored in the defect tracking system. But letting them continue to mount up in that system not only shows a lack of professional discipline, it also creates an ever-growing list whose transaction cost keeps rising: a large list takes more time to manage than a small one. In addition, defects in the defect tracking system begin to age!

Have you ever worked on a product with known defects that had been there for four, five, or more years? Much to my embarrassment, I have :-( Can we agree that if a defect has been in a system for that many years, we are very unlikely to ever fix it? And, after so much time, fixing that defect might actually be a bad idea: we might break everyone’s workaround for it. Then we get a flood of support queries asking what we did. We reply, “We fixed the defect.” Customers might then say, “Why did you do that? Now I have to update my workaround procedures!”

Defects Should Expire

I think we ought to treat all defects like we treat milk—we should stamp them with an expiration date! If you haven’t fixed a defect by its expiration date, consider it stale and just throw it out! If it is truly an important defect and worth your time, it is sure to get re-submitted. If it does, you will at least know it is still worth fixing—perhaps that will prompt you to increase its importance and actually fix it.
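To make the idea concrete, here is a minimal sketch of what stamping and purging might look like. Everything in it is an assumption for illustration: the Defect record, the stamp and purge_expired helpers, and the 90-day default shelf life are hypothetical and not taken from any particular defect tracking tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Defect:
    # Hypothetical record; a real tracker will have its own fields.
    title: str
    reported_on: date
    expires_on: date


def stamp(title: str, reported_on: date, shelf_life_days: int = 90) -> Defect:
    """Stamp a newly reported defect with an expiration date.

    The 90-day default is an arbitrary placeholder, not a recommendation.
    """
    return Defect(title, reported_on, reported_on + timedelta(days=shelf_life_days))


def purge_expired(defects: list[Defect], today: date) -> tuple[list[Defect], list[Defect]]:
    """Split defects into (fresh, stale).

    A defect is still fresh on its expiration date and stale the day after.
    """
    fresh = [d for d in defects if d.expires_on >= today]
    stale = [d for d in defects if d.expires_on < today]
    return fresh, stale
```

Running purge_expired on a regular cadence (say, at each sprint boundary) and simply dropping the stale list is one way to apply the rule. A re-submitted defect would go through stamp again and get a fresh date.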

As an improvement strategy, shorten the expiration period over time, so that the expiration date moves closer and closer to the date the defect was detected. This approach will motivate the desired behavior of never letting a defect live long. Perhaps you will even conclude that if a defect lives for more than one day (or one sprint) then maybe it really isn’t worth fixing at all and you will just delete it!
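One way to picture the tightening is as a schedule of shelf lives that shrinks at each review. The sketch below halves the shelf life each time until it reaches a floor. The halving rule, the function name shelf_life_schedule, and the one-day floor are my assumptions for illustration; any rule that steadily shortens the period would serve the same purpose.

```python
def shelf_life_schedule(start_days: int, floor_days: int = 1) -> list[int]:
    """Return successive shelf lives (in days), halving until the floor is reached.

    Halving is an illustrative choice, not a prescribed policy.
    """
    if start_days < 1 or floor_days < 1:
        raise ValueError("start_days and floor_days must be at least 1")
    schedule = [start_days]
    while schedule[-1] > floor_days:
        # Integer halving, but never drop below the floor.
        schedule.append(max(floor_days, schedule[-1] // 2))
    return schedule


print(shelf_life_schedule(90))  # [90, 45, 22, 11, 5, 2, 1]
```

A team might move one step along the schedule each quarter, or whenever it finds that few defects are actually reaching their expiration date.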