Tuesday, August 28, 2007

Defects On Sale!

Today after our planning game I did a short poll on how the guys perceive test driven development and pair programming. We've been trying to do both for some time now, and since I take the blame for introducing both practices, I feel I'm somewhat - um - preoccupied on that matter. A few days ago, I was caught totally off guard when Richard told me that, well, he doesn't believe programming in pairs is more productive. Bummer. And I had believed my show to be a grand circus.

Coming down to earth from my alien space shuttle of imaginative knowledge in the face of uncertainty I realized that I didn't really have a clue what my teammates thought about our recent process improvement tactics. I figured the easiest way to find out would be to ask them. So I did a short poll. There were six people. Including me. I asked four questions:

Question                                                   Yes!  No!
Do you think TDD makes you more productive?                3     3
Do you think TDD leads to better quality?                  6     0
Do you think pair programming makes you more productive?   3     3
Do you think pair programming leads to better quality?     6     0

Now this is an interesting bite from the apple of knowledge: while we all seem to agree that pair programming and TDD increase code quality, half of the guys think that this rise in quality comes at a cost in overall productivity. Unfortunately shooting them with my nerf gun didn't help to teach them reason, so I concluded that the half I am in may be wrong. Perhaps.

But since I usually don't give in that fast I pondered over this anomaly of perception during our two-years-wedding-anniversary-dinner. While I munched down a deliciously flavorsome tenderloin, Anna proposed that maybe if you believe that TDD and pair programming don't increase productivity you don't expect to make any errors. While the implication would be true, the poll's data seems to suggest that all of the guys think that the practices improve quality - which implies that they expect to make errors.

So when we arrive at a point where we are self-conscious enough about our code to expect ourselves to err frequently, a simple question remains:

What Is The Relation Between Quality And Effort?

This is where a little math may help... Let's define the overall effort of a feature as the effort it takes to produce a certain function in lines of code (how crude!) plus the effort to fix the expected errors. The oversimplified measure of programming tasks in lines of code is, of course, questionable to the degree of calling it excrement of horned mammals. On the other hand it allows me to do a quick-and-dirty worst-case pi times thumb calculation.

effort(feature) ->
codingEffort(linesOfCode(feature)) +
expectedFixingEffort(linesOfCode(feature))

Let's further simplify (yuk) that the coding effort is defined as directly proportional to the lines of code of the feature:

codingEffort(numberOfLines) ->
codingEffortPerLine * numberOfLines

Excessive googling (and IEEEing) informs us that the defect rate is normally defined as defects per thousand lines of code. So without test driving my functions I'd expect the fixing effort to be something along the lines of:

expectedFixingEffort(numberOfLines) ->
fixingEffortPerDefect * (defectRate / 1000) * numberOfLines
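
For the code-minded, the three definitions so far can be transliterated into Python. This is only a sketch of the crude model above: the snake_case names and the decision to pass every constant in as a parameter are assumptions made for illustration, and the effort unit is whatever you like (hours, bucks, nerf darts) as long as it is used consistently.

```python
def coding_effort(number_of_lines, coding_effort_per_line):
    # Coding effort is assumed to be directly proportional to the line count.
    return coding_effort_per_line * number_of_lines


def expected_fixing_effort(number_of_lines, fixing_effort_per_defect, defect_rate):
    # defect_rate is given in defects per 1000 lines of code.
    return fixing_effort_per_defect * (defect_rate / 1000) * number_of_lines


def effort(number_of_lines, coding_effort_per_line,
           fixing_effort_per_defect, defect_rate):
    # Overall effort = up-front coding effort + expected effort to fix defects.
    return (coding_effort(number_of_lines, coding_effort_per_line)
            + expected_fixing_effort(number_of_lines,
                                     fixing_effort_per_defect, defect_rate))
```

With made-up inputs of 1000 lines, 1 unit of effort per line, 100 units per defect and 20 defects per 1000 lines, the model yields 1000 units of coding plus 2000 units of fixing.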

But where does this lead? Good question. My answer is even more assumptions: Perhaps we can agree that if we make errors (and we do, don't we) introducing practices that increase quality allows us to trade coding effort (up-front effort) against fixing effort. If you read carefully, perhaps you ask whether I may exchange effort for cost arbitrarily... well, technically, no, but since I'm a software developer the Flying Spaghetti Monster may smile forgivingly onto my unworthy soul.

For example, when I do pair programming and my partner finds an error that I didn't see, the effort of this lapse is about:

  1. "hey, shouldn't that read '>=' instead of '>'?"

  2. "oh, yeah, 'course"

  3. *clickety-click*

-- 3 seconds --

When such a defect is not found until the product is in the field, the effort of fixing the error is:

  1. Cost of the error for the customer (lost money, lost customers, being angry, beating up the pup)

  2. Reporting the error to the provider

  3. Checking the error logs and dealing with the customer

  4. Reporting the error to our hotline

  5. Checking the error at our site and finding out what the error really is

  6. Reporting the error to our development

  7. Prioritizing the error

  8. Trying to reproduce the error and find out what the customer really did

  9. Finding the error

  10. Fixing the error

  11. Building a new patch-release

  12. Testing the patch-release

  13. Getting the patch-release approved by the customer

  14. Updating the live units with a certain probability of update-death

  15. (More indirect cost due to loss of trust, etc)

-- um, more than 3 seconds, definitely --

I think it is not presumptuous to claim that increasing quality may also increase overall productivity if the expected effort to fix an error is high enough with regards to the expected decrease of errors due to better quality. The refined question is

What does a worst case error effort scenario look like in the break-even point of quality against productivity?

Let's assume we know a practice that increases our coding effort by a factor (additionalEffort > 1) and improves our error rate by a different factor (defectRateImprovement in [0;1[, that is, the defect rate gets multiplied by it, so smaller is better). For the practice to be effort efficient the overall effort without implementing this practice must be greater than the overall effort when using the practice. Using the already defined formulas this yields:

(codingEffortPerLine * numberOfLines) +
(fixingEffortPerDefect * (defectRate / 1000) * numberOfLines)
>
(additionalEffort * codingEffortPerLine * numberOfLines) +
(fixingEffortPerDefect *
(defectRate * defectRateImprovement / 1000) * numberOfLines)

Tackling this equation with a load of 7-th grade mathematics gives:

fixingEffortPerDefect * (defectRate / 1000) *
(1 - defectRateImprovement)
>
codingEffortPerLine * (additionalEffort - 1)
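
In case the seventh grade was a while ago, the steps can be spelled out. The single-letter symbols are shorthand introduced only for this derivation: c for codingEffortPerLine, f for fixingEffortPerDefect, d for defectRate, a for additionalEffort, r for defectRateImprovement and n for numberOfLines, assuming n > 0.

```latex
\begin{align*}
  c\,n + f\,\frac{d}{1000}\,n &> a\,c\,n + f\,\frac{d\,r}{1000}\,n
    && \text{effort without the practice exceeds effort with it} \\
  c + f\,\frac{d}{1000} &> a\,c + f\,\frac{d\,r}{1000}
    && \text{divide both sides by } n > 0 \\
  f\,\frac{d}{1000} - f\,\frac{d\,r}{1000} &> a\,c - c
    && \text{collect fixing terms left, coding terms right} \\
  f\,\frac{d}{1000}\,(1 - r) &> c\,(a - 1)
    && \text{factor both sides}
\end{align*}
```

Note that the number of lines drops out entirely, so in this model the break-even point does not depend on the size of the feature.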

Should this innocent-looking inequality be close enough to reality to make any sense, we could conclude that

  • After you cut down the defect rate by a factor of two, cutting it by yet another factor of two would require twice the opportunity cost. Which means that halving your defect rate gets more and more expensive with regards to the opportunity cost of letting the defect go wild.

  • If you know your current defect rate and your current price per defect, you can guess whether the defect reducing effort spent for a certain practice will be cost efficient. Of course a practice may and probably will have other impacts. But that's a different bed-time story. Featuring a hungry gorilla and a beautiful princess.

Now that we've got a nice equation we can torment it with some values, fed to our greedy mouths by the power of the Flying Spaghetti Monster. Let's assume that we have a defect rate of 20 defects per 1000 lines of code (which a google search reveals to be considered somewhat "normal"). Let's now assume that our practice increases coding effort by a factor of 2 (which is the worst case for pair programming, obviously). Let's further assume that this will find one tenth of all errors directly when they're implemented (fixing the errors in this phase is covered easily by the effort factor of 2). Watch and behold 3rd grade maths:

fixingEffortPerDefect * (20 / 1000) * (1 - 0.9)
>
codingEffortPerLine * (2 - 1)

... or ...

fixingEffortPerDefect > codingEffortPerLine * 500

This means that for a defect rate of 20 errors per 1000 lines of code using a practice that doubles your coding effort and finds a tenth of the errors during coding will save you some bucks if the expected effort of fixing an error is more than 500 times the effort of writing a single line of code.
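
If you'd rather torment the equation with your own values, here is a minimal Python sketch of the break-even calculation. The function name and the snake_case parameters are assumptions made for illustration; the formula is just the inequality from above, solved for the ratio of fixing effort per defect to coding effort per line.

```python
def break_even_ratio(defect_rate, additional_effort, defect_rate_improvement):
    """Ratio fixingEffortPerDefect / codingEffortPerLine above which
    the practice pays off in this crude model.

    defect_rate: defects per 1000 lines of code
    additional_effort: factor (> 1) by which coding effort grows
    defect_rate_improvement: factor in [0, 1) the defect rate is multiplied by
    """
    # Extra coding effort per line, in units of codingEffortPerLine.
    extra_coding_per_line = additional_effort - 1
    # Defects avoided per line of code thanks to the practice.
    defects_avoided_per_line = (defect_rate / 1000) * (1 - defect_rate_improvement)
    return extra_coding_per_line / defects_avoided_per_line


ratio = break_even_ratio(defect_rate=20, additional_effort=2,
                         defect_rate_improvement=0.9)
print(round(ratio))  # prints 500
```

The result is rounded before printing because floating-point arithmetic lands a hair away from 500. A defect_rate_improvement of exactly 1 (the practice finds nothing) would divide by zero, which is the model's way of saying such a practice never pays off.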

If you want even more numbers, let's further assume that in C++ you need 60 lines of code per function point (now we get really braggy) and that you can somehow earn $200 per function point. That puts a line of code at roughly $3.33, which means that our practice lowers overall cost if the expected price per defect is greater than about $1,670.
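
The same conversion as a tiny Python sketch, assuming (as the paragraph above does) that the price earned per function point is a fair stand-in for the cost of coding it. The function and parameter names are made up for illustration.

```python
def break_even_defect_price(price_per_function_point,
                            lines_per_function_point,
                            break_even_ratio):
    # Price of a single line of code, derived from the function point price.
    price_per_line = price_per_function_point / lines_per_function_point
    # A defect must cost more than this for the practice to pay off.
    return price_per_line * break_even_ratio


price = break_even_defect_price(price_per_function_point=200,
                                lines_per_function_point=60,
                                break_even_ratio=500)
print(f"{price:.2f}")  # prints 1666.67
```

Swap in your own language's lines per function point and your own rates to see where your break-even defect price ends up.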

It all boils down to this: If you work in an environment where the average price per defect found outside the holy halls of your development team is greater than 2000 bucks, introducing a technique that doubles the coding effort to prevent a tenth of the errors will reduce development cost and thusly increase productivity. Well, if I really did a worst case analysis and didn't mess up the seventh grade maths up there, that is.

Do you think a total expected cost of $2000 per defect is a lot? Does this apply to your work environment? Do you actually have any clue how much your favorite defect is today?

1 comment:

  1. Dave,

    thank you for your interesting links. I believe that all you say is true.

    I was trying to make a different point, though. I tried to extract the effect of one practice with regards to the defect rate and the expected cost of a defect.

    I include the worst-case up-front additional cost for introducing a new practice in the coding effort factor. I use a worst-case defect-rate improvement to extract a pi times thumb number for the worst case cost of a defect for which the practice would be cost efficient /just by looking at its impact on the defect rate/.

    If it is already cost efficient when I look at the defect rate, then I don't need to look further. The effects I ignore can only /improve/ the practice's pay-off, this is why I ignore them - because I believe that in many environments a small impact on the defect rate alone improves productivity.

    The interesting point you make is the idea of time-to-market. One could argue that this cannot possibly be included in the "worst-case" part of my crude model. On the other hand I don't yet believe that you really "lose" time to market with a lower defect rate, since most probably the customer will find some of the errors and demand that you fix them before going live.