Sunday, 22 September 2013

#NoEstimates

Once in a while I come across a host of different 'fads' which actually have something to them, but are sold as something completely different, often for what I consider are the wrong reasons or focus on the wrong things. This is like Viagra, which was created as something completely different, but has become synonymous with sex, become the butt of jokes and the epitome of junk mail amongst a host of other things. Indeed, back in the day, before people understood agility and as is the case with lean software development today, this was the same. Consider it the same as tech following Gartner's hype curve.

This time round, it is the turn of the 'No Estimates' school.

No estimates is a movement which seems to be sourced in the non-committal Kanban world which people assume to mean that no estimates are given for tasks. This is not actually true. The aim of the group is to move away from the concept of estimation as we know it. This includes the sizing of tasks by story points, and concentrating on counting cards. ThoughtWorks released an e-Book in 2009 about using story cards as a measure of velocity and throughput. I personally take this one step further and prefer to break tasks down into the smallest logical unit with the lowest variance. What I mean by this is that I prefer to play to the human strength of being better able to measure small things than large (in terms of variance of the actual metric from the expected metric).

This means that I personally much prefer to size things in single point items/stories. Larger tasks are then composed of these smaller subtasks, like Kanban in manufacturing composes larger parts from smaller ones. The lower variance means lower delivery risk and lower safety (read inventory) and pushes the team closer to the predictability afforded by Little's law as the safety margin factor to zero.

Why Smaller?

Consider a burn down chart of tasks. The burn down never actually follows the burn down path exactly. The nature of story sizes means that you will have an 8 point task move across the board and completing it will decrement the burn down by a discrete 'block' of points (8 in this case). So the best you can get is a stepped pattern, which in itself makes the variance larger than it needs to if the burn-down rate is taken as the 'ideal' baseline (note, a burn down chart is the 'ideal' model of how the work will decompose).

Why do you care? Because this stepped pattern introduces a variation of its own. This means that some times you will have slack, others you'll be rushing, all during the same project. This is all without the introduction of a variance on the size of the task at hand (as shown my a previous blog post on evolutionary estimation, often points don't actually reflect the relative effort in stories) which in themselves introduce a variance on this variance. The fabricated image below shows the variance on a burn down due to the step and when you consider the variation in the size of one point tasks, bracketed in the time periods at the bottom, this is the second variance due to the timings being out.

fig 1 - Burn down of the variation of both the 'steps' and the
delivery timing for different sized stories. The idealised burn down is shown in red (typical of tools like JIRA Agile).


Note, the blue line shows the top and bottom variance of the actual delivered timing (i.e. the green step function), not against the red burn down line. If the average were plotted on the above, the burn down 'trajectory' would sit above the red line, passing half way through the variation. So as of any moment, the project would look like it would be running late, but may not be. It's harder to tell with the combination of the variance of task size and time per task.

Reducing the size of stories to one point stories gets you closer and closer to the burn down line and gives you the consistent performance of the team, which will have a much narrower variance simply because of the use of a smaller unit of work per unit of time. The following example, which is the same data as in fig 1, just burning down by one point, shows that for this data, the variation is reduced, simply by making the story points a consistent size.

fig 2 - 1-point burn down chart showing shorter variation


The reduction in variation is 12 percent, which by proxy, increases the certainty, simply by sizing the tasks per epic differently. This reduction in variation reduces the variance around the throughput (which is story points per unit sprint/iteration). The only 'variable' you then have to worry about is the time a story point takes, which then simply becomes your now relatively predictable cycle time. 

The key with No Estimates, as should be apparent by now, is that it is an absolute misnomer.  They do estimate, but not as a forecast with many variables.

Why does this work?

There is a paper and pen game I play when explaining variance to people. I do this twice and for each go, I draw one of two lines. Firstly one short and one long, on a piece of paper and each time ask Joe/Jane Bloggs to estimate the size of the two lines on the paper. I then ask them to estimate how many longer lines can fit in the shorter one, by eye only. After all three steps are complete, I get a ruler and measure the lines. Usually, the longer line and combination are significantly off, even if the estimates of the short line is fairly good. Please do try this at home. 


fig 3 - Estimate the size of the smaller and latter, then estimate how many small tasks go into the latter.


As humans, we're rubbish...

...at estimating. Sometimes we're also rubbish at being humans, but that's another story. 

The problem arises because there are three variances to worry about. The first is how far out you are with the shorter line. When playing this game, most people are actually quite good at estimating the shorter line. For say, a 20mm line, most will go between 18mm and 21mm. The total variation is 3mm. That's 15 percent of the length of the line. 

With a longer line of 200mm say, most people are between 140mm and 240mm. A total variation of 100mm which is 50% of the line length. 

When the combination of these errors occurs, it is very rare that they are cancelled out altogether. However, the total error when performing the 20mm into the 200mm line effectively multiplies the error by at least 10 (as you take your smaller line measure by eye and apply it one after the other to measure the longer line, the error adds up) and on top of that, you have the error in estimating the big line, which means the total effect of the variances is a factor of the multiplication of the variance of the smaller line with the larger and not the addition. It's non-linear.

Note, the important thing isn't the actual size of the line. You first draw the line and you don't care how big it is. It's the deviation of the estimate from the actual size of the line that's important.

What's the point?

OK, granted, that joke's getting old. From my previous evolutionary estimation blog post, you can see that estimation is not a super-fast nor simple matter when trying to apply it to retrospective data. Indeed the vast majority of developers don't have the statistical background to be able to analyse the improvements they make to their estimation processes. By contrast, No Estimates aims to do away with the problem altogether by fixing the size of a story to one size. For example, what would have been a three point story in the old(er) world. In a way that's a good thing and intuitively relates better to the concept of a kanban container size, which holds a certain number of stories. In the software world this maps to the idea of an epic, or story with subtasks.

Conclusion, is what you said previously is 'pointless'?

Nope! Definitely not. Makes a good joke heading though.

The previous techniques I have used still apply, as the aim is to match the distribution in exactly the same way, just with one story size as opposed to the many that you have in other estimation techniques. Anything falling outside a normally distributed task could get 'chopped' into several story sized objects, or future pieces of work resized so each subtask is a story.

Just to reiterate, as I think it is worth mentioning again. Projects have never failed because of the estimates. They filed because of the difference between estimated and actual delivery times. That's your variation/variance. Reduce the variation, you increase predictability. Once you increase predictability, speed up and monitor that predictability. Then 'fix it' if it gets wide again. This is a continuous process, hence 'continuous improvement'.

2 comments:

  1. Very good post.
    You raise dome very important points about the choice of "size" of story, and the impact on the progress measurement.
    This impact is critical to help people make the right scope decisions, which is one aspect you don't cover here.

    All of these come together in what I define as #NoEstimates and explain in detail at NoEstimatesBook. Com

    ReplyDelete
  2. The difficulty I feel isn't with Estimates. It's how they are enforced or understood. As many in the #NoEstimates community acknowledge, people don't understand how to use estimates and what their limitations are in terms of certainty. However, this doesn't make estimates the problem, it makes the people a problem. Pretty much like most problems. To blame estimates themselves is like blaming gravity for someone falling over and hurting themselves if they weren't watching where they were going.

    The #NoEstimates movement has the potential to be amazing. However, the space it feels like it is being taken down, including by the forces of some of the proponents of the movement itself, doesn't feel like it will do anything but throw out the baby with the bathwater, introduces a distinction between things like 'forecasts' and 'estimates' (they are the same thing) and indeed, introduce a silo between those that can only emotionally function with numbers and those that don't. Shifting the stress doesn't really solve the problem. I've yet to be convinced in either direction though. I just don't want it to lead to my rejection of the principles due to the characters involved or perhaps the slightly less meaningful direction, because that's not how I work. I don't like rejecting something for the wrong reason. I don't care too much if the conversations are unpleasant. I just want them to be focused and 'right'.

    ReplyDelete

Whadda ya say?