
PEZ -or- Why one shouldn't track Scrum Tasks in Hours

I've been called to task - asked why I teach teams to distinguish between Task Hours and some non-denominational, non-specific, non-absolute unit.  The latest team I guided used PEZ!  In the daily stand-up a person might be heard to say:
"Yesterday I was working on the code to DisCombobulate the GUID so that we could distinguish customers from guests, and the task to peek into the hashed session state was harder than I anticipated - I will need 2 more PEZ to get that done today."

Benefits of "Task Points"

Because it works and is more fun.  Not a sufficient and rational reason.

Because it's a fractal of the concept of Story Points.  Not a practical enough reason.

Because it encourages transparency and acknowledges that we are really bad at estimates.  Not sure we wish to propagate that notion.

Because Scrum has a value of Openness, and in that spirit - can we admit that a team of 7 people doing 50 "hours" of tasks a week is really a poor metric?   Is the metric wrong - or is it being perceived poorly - or do we just not know what we have measured?

Have we truly measured the sprint progress in the units of HOURS? You know that aggregate of 60 minutes.  That man-made concept of time.  That universal unit that all societies agree upon, even the Imperial Unit based USA.

[Figures: Burndown in Hours; Burndown in PEZ]
Let's take a look at a team's burndown, using traditional hours.  It shows that in the first week of the sprint they accomplished 50 hours of work.  But there are 7 people working for the week.  Generally speaking that's 7 x 40 hr/week = 280 hrs.  Now many people like to point out that there are not 8 hrs of work in a typical work day.  Many people believe that every day contains a few non-productive hours.  Many believe there are 6 hrs of productive time in a typical day (7 x (6 hr x 5 days) = 210 hrs).  What can we deduce from our burndown?  Maybe what we've been estimating and burning down is not this well-known unit of time called an hour.
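
The capacity arithmetic above can be written out as a small sketch.  The function name `team_capacity_hours` and the variable names are illustrative assumptions, not part of any Scrum tool; the numbers are the ones from the paragraph above.

```python
def team_capacity_hours(people: int, hours_per_day: float, days_per_week: int = 5) -> float:
    """Wall-clock hours a team has available in one week."""
    return people * hours_per_day * days_per_week


burned_hours = 50                        # what the burndown chart claims for week one
nominal = team_capacity_hours(7, 8)      # 7 x 40 hr/week
productive = team_capacity_hours(7, 6)   # 7 x (6 hr x 5 days)

# Wall-clock hours that went by for each "hour" burned down on the chart.
ratio = productive / burned_hours
print(nominal, productive, round(ratio, 1))
```

Even with the more forgiving 6 hr/day assumption, roughly four wall-clock hours pass for every "hour" the chart says was burned.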

We have been estimating in hours - so it must be hours.  This is a myth. The analysis at RescueTime has shown this myth to be busted by looking at the habits of the users of their tracking tools. Their tools calculate a productivity rating in addition to hours on tasks.  To spoil the read... "And with an average productivity pulse of 53% for the year, that means we only have 12.5 hours a week to do productive work."

So maybe that study puts an end to the concept of Ideal Engineering Hours.  I've been involved in these dialogues and the general consensus is 5 - 6 ideal engineering hours per day.  That does not quite jibe with what a real study finds (12.5 hr/week / 5 days = 2.5 hr/day).

If a team of 7 should be doing something like 200 hrs of productive work a week - then what do we think of a team of 7 only completing 50 hrs in a week?  Well is that rational?  No.  So what's wrong with this irrational logic?  One could blame the individuals on the team and try to improve their ability to estimate tasks.  Yet if they are off by a factor of 4 - how much work will go into improving their poor estimation technique?  And what about this empirical measurement technique?  If we believe in the empirical technique then does the accuracy of our estimates matter?

My belief in transparency (openness) has me concerned that if we hang a chart on the wall that professes that the team completes 50 hours in a week - then someone is not telling the truth.  If one observes the team, as a Team Agility Guide might do, and sees that the team members are working diligently on the sprint stories for the greatest part of every day, then it is obvious that these units might just be the problem.  The estimate is made in good faith, but it doesn't equate to what the wall clock tells.  Something is telling a fib, a lie.  Let's quit lying and make our burndown truthful.

One technique is to realize the empirical measurement does not require hours to work; choose any unit you wish - try PEZ.  Then one can correlate the PEZ to work hours, and stop talking about "hours" that require half a day of work to complete.
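
One way to sketch that correlation, assuming the team records the PEZ burned in a sprint week and the wall-clock hours actually spent on sprint work.  The names `hours_per_pez` and `pez_to_hours` are hypothetical, and the numbers are illustrative only.

```python
def hours_per_pez(pez_burned: float, wall_clock_hours: float) -> float:
    """Empirical conversion rate: wall-clock hours observed per PEZ burned."""
    if pez_burned <= 0:
        raise ValueError("need a positive number of PEZ burned to correlate")
    return wall_clock_hours / pez_burned


def pez_to_hours(pez: float, rate: float) -> float:
    """Translate a PEZ estimate into wall-clock hours using an observed rate."""
    return pez * rate


# Assumption for illustration: 50 PEZ burned by 7 people working about 6 hr/day for 5 days.
rate = hours_per_pez(50, 7 * 6 * 5)
print(f"1 PEZ is roughly {rate:.1f} wall-clock hours")
```

The rate is measured after the fact, not estimated, so it can be recomputed each sprint without asking anyone to estimate differently.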

Mike Cohn's critique of "Task Points"

I've read and reread Mike's critique and advice, but I don't get his point.  He finds tremendous value in Story Points, and notes that Task Points have similar disadvantages:
  • A foreign concept to many team members
  • The need to establish baseline values against which relative estimating can begin
  • A concern that estimates drift over time in comparison to the original relative values
I will share my observation - the many teams that already understand relative points concepts and why they are beneficial do not have a new or foreign concept to learn.  No baseline values are required to begin.  Simply change the unit one labels the task estimate with from "hour" to "PEZ"; there is no baseline mumbo jumbo.  I'm too ignorant to understand the 3rd disadvantage; it seems to me this is possibly a great thing, a result of people learning.

Perceived Benefits of Task Points

How does having a team track remaining effort on work (story or task) benefit the delivery of value? Great question. Will a list of reasons do?

  1. It moves the focus from how much effort - to how much effort is remaining - imagine the sunk cost fallacy;
  2. when an item takes longer than your typical inspection cadence (sprint for a story; day for a task) it requires an explicit reestimation and recommitment to the item getting to done;
  3. when the list of estimates on an item grows long (say past 2 or 3) it is a wonderful indicator of an impediment / when a task (which should be much less than a day) lasts in process for a week it is a great indicator that help is needed - training a team to hold members accountable is easier with a visual signal;
  4. when the item is done-but... and a person wants to mark a zero, but ... it's a great indication of a newly discovered task
  5. I suppose the metric (burndown) is more accurate to reality than an all-or-nothing assessment of task/story effort;
  6. the practice reinforces the inspect/adapt micro cycle; and someone once told me - if it's hard to do; do it more frequently.
  7. ... there may be others ...
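
A minimal sketch of how the re-estimate history in the list above could act as a signal.  The `Task` class, its method names, and the threshold of 3 are assumptions made for illustration; the threshold simply mirrors the "say past 2 or 3" rule of thumb.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str
    # Remaining-effort estimates in PEZ, one per inspection, newest last.
    estimates: List[float] = field(default_factory=list)

    def reestimate(self, remaining_pez: float) -> None:
        """Record today's estimate of the effort still remaining."""
        self.estimates.append(remaining_pez)

    @property
    def remaining(self) -> float:
        """Latest remaining estimate; 0 if the task was never estimated."""
        return self.estimates[-1] if self.estimates else 0.0

    def looks_impeded(self, max_estimates: int = 3) -> bool:
        """A long list of estimates on an unfinished task suggests help is needed."""
        return len(self.estimates) > max_estimates and self.remaining > 0


def burndown_total(tasks: List[Task]) -> float:
    """Sum of remaining effort, which is what the burndown chart plots."""
    return sum(task.remaining for task in tasks)


peek = Task("peek into hashed session state")
for pez in (3, 3, 2, 2):
    peek.reestimate(pez)

login = Task("wire up login form")
for pez in (2, 0):
    login.reestimate(pez)

for task in (peek, login):
    print(task.name, task.remaining, task.looks_impeded())
print("remaining on the burndown:", burndown_total([peek, login]))
```

The burndown only ever looks at the latest estimate, so effort already spent never enters the total.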

Is it useful to track remaining effort at a task level? Yes and No; it's situational. 

Let's first define the No situation: the team is performant, gets stories done easily, has a known velocity, and has no problem planning and achieving a sprint commitment.

Now the alternative situation... is it useful... I think so.

See Also:

Don’t Estimate the Sprint Backlog Using Task Points by Mike Cohn

