Sunday, October 31, 2010

The Uncertainty Principle in Software Engineering

Way back in 1996, three computer scientists, Hadar Ziv, Debra J. Richardson, and Rene Klosch, wrote a paper that should be better known than it is. It's called The Uncertainty Principle in Software Engineering, and many shorten the reference to Ziv's Uncertainty Principle. It states that "uncertainty is inherent and inevitable in software development processes and products". This principle sheds light on why Waterfall is well-intentioned but flawed as a development methodology, and why Agile is better suited to dealing with the uncertainty in software development.

Ziv's Uncertainty Principle models uncertainty in software engineering using Bayesian Belief Networks. In the software development world, Bayesian nets are probably best known for their use in search algorithms applied to large volumes of text and hypertext.

The authors focused on five areas of software engineering to demonstrate uncertainty:
  1. requirements analysis
  2. transition from requirements to design and coding
  3. software re-engineering
  4. software reuse
  5. software testing

The authors also provided three example sources of uncertainty. My paraphrased descriptions of each follow:
  1. Uncertainty in the problem domain: The problem for which an application is developed exists in the real world. We all know the real world has many uncertainties, many of which are not, or cannot be, addressed by the application being developed.
  2. Uncertainty in the solution domain: Building the application itself introduces uncertainty beyond the uncertainties in the problem domain. The example used in the paper is debugging race conditions caused by concurrent use. There is uncertainty about the exact conditions that cause the race, as well as about how to observe those conditions in order to reproduce it. The authors cite Heisenberg's uncertainty principle as a similar effect, whereby the mere attempt to observe an environment changes it. If you've had to debug a problem with concurrent use, then I'm sure you see this connection.
  3. Human participation: Human involvement introduces uncertainty through the business logic built into the application. Business logic coded into an application does not typically address explicit uncertainties. We code mostly based on certainty, not uncertainty.
The point is that attempting to address in advance all of the potential situations and conditions that will be faced in production is futile. The most effective way of dealing with the inevitable uncertainties is to get the application in the hands of the users as soon as possible and let the real world tell us what needs to change. This doesn't mean production alone. It means putting the application in the hands of users who can use the application in real world circumstances.

Waterfall makes the noble, yet inherently flawed, attempt at making the application as rock solid as possible on paper before commencing development. Agile's principle of frequent inspection requires that the team deliver working code frequently. Agile accepts and embraces the fact that once in production, the real world will show that significant imagined truths will be deemed either false or incomplete thus requiring non-trivial modifications and enhancements. These changes are most effectively implemented when features are introduced and not when the application is fully developed.

Saturday, October 30, 2010

Another Explanation of Story Points

Story points seem to be the most misunderstood concept within Agile so I'm going to take my stab at helping others to understand what they are and their importance.

The typical question is: Why estimate user stories with these ambiguous and arbitrary things called story points when I can use hours, which inherently make more sense to me?

Hours may make more sense than story points at this very moment, but hopefully I'll be able to change your mind by the time you finish reading this article.

The major impediment to using hours is the variability between people and teams. I recently heard Jeff Sutherland state that a Yale University study has shown that what takes the best developer one hour to complete takes the worst developer 10 hours. When comparing the best and worst teams, the gap grows far wider: one hour for the best teams translates to 2,000 hours for the worst teams. The variability of hours is a real impediment, and accounting for it is typically a hugely time-consuming task on most projects.

Story points simplify the estimation process by taking developer and team variability out of it: the team assigns a level of complexity to each user story instead. Many in the Agile community use a scale loosely based on the Fibonacci sequence as the units of measure: 1/2, 1, 2, 3, 5, 8, 13, 21, 34 and 45 per user story. Think of story points as a more flexible version of assigning estimates as small, medium or large.

The story point estimation process begins with the team selecting the smallest user story and mutually agreeing on its complexity by assigning it a number from the Fibonacci subset of numbers. This user story is called the "keystone", meaning all other user stories are estimated by their complexity relative to the keystone user story.

The byproduct of this is that upon sprint completion the team is able to report to the Product Owner with a more concise measure of productivity: the number of story points completed. The Product Owner sees things through the lens of project stories, features, etc. It makes more sense to communicate velocity, i.e. sprint productivity, in the form of story points than hours. Story points translate directly to what is manifested on screen when the working code is demonstrated, and hours do not.

Here is an example of using hours as a measurement of sprint effectiveness. A team of four developers works on a project for a two-week sprint. At 40 hours per week, the four developers have 320 hours of available time to work during the sprint. However, once the sprint is complete, their velocity shows that they only accomplished 60 hours of work, based on the estimated hours assigned to the user stories they completed. That is a misleading depiction of productivity that leads to irrational conversations about whether the team is working hard enough.

Using story points, the team's velocity is expressed in terms of productivity related to the complexity of the tasks completed. For example, let's say the Product Owner knows in advance of the sprint that the team's goal is to complete 35 story points. At sprint completion, the effectiveness of the sprint is measured against the sprint goal and previous sprint velocities. Story point velocity is used to view the overall project productivity in terms of acceleration/deceleration, i.e. is velocity increasing or decreasing.
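To make the velocity comparison concrete, here is a minimal Python sketch. The function names and the sprint history are hypothetical, made up for illustration; only the 35-point sprint goal comes from the example above.

```python
def goal_attainment(completed_points, goal_points):
    """Fraction of the sprint goal that was actually completed."""
    if goal_points <= 0:
        raise ValueError("goal_points must be positive")
    return completed_points / goal_points


def velocity_trend(velocities):
    """Sprint-over-sprint change in velocity.

    Positive values mean the team is accelerating,
    negative values mean it is decelerating.
    """
    return [later - earlier for earlier, later in zip(velocities, velocities[1:])]


# Hypothetical history: story points completed in three consecutive sprints.
history = [28, 31, 35]

print(goal_attainment(history[-1], 35))
print(velocity_trend(history))
```

With this made-up history the team met its 35-point goal in the latest sprint, and every entry in the trend is positive, so velocity is increasing.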

The Product Owner can also translate velocity into an estimated release date more easily than with estimated hours. As an example, if there are 350 total story points remaining in the product backlog, and the velocity of the team is 35 story points per sprint, then the product backlog will be exhausted in 10 sprints.
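The same arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names, the start date, and the assumption that velocity stays constant are mine, and the two-week sprint length is borrowed from the earlier hours example.

```python
import math
from datetime import date, timedelta


def sprints_remaining(backlog_points, velocity):
    """Number of sprints needed to exhaust the backlog, rounded up."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


def projected_release(start, backlog_points, velocity, sprint_weeks=2):
    """Projected date the backlog is exhausted, assuming constant velocity."""
    sprints = sprints_remaining(backlog_points, velocity)
    return start + timedelta(weeks=sprints * sprint_weeks)


# 350 points remaining at 35 points per sprint, as in the example above.
print(sprints_remaining(350, 35))

# Hypothetical start date, with two-week sprints.
print(projected_release(date(2010, 11, 1), 350, 35))
```

Rounding up matters when the backlog does not divide evenly: 351 points at the same velocity would need an eleventh sprint.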

Lastly, estimating projects in story points is not only faster but more accurate than estimating in hours! Jeff Sutherland references two experiences related to this in one of his blog articles (read it here).

I hope this clears up what a story point is, why we use them, and provides a compelling reason to transition from hours to points.
