Delivery Reimagined - Estimation

For decades, software development teams have been trapped in a frustrating cycle. We spend countless hours breaking down projects into the smallest possible tasks, meticulously estimating each one in hours or days, and then watch as those precise calculations crumble at the first sign of unexpected complexity. It's a process that creates an illusion of certainty while delivering constant anxiety.

There’s a better way, and it’s less about complex spreadsheets and more about common sense. It’s called relative estimation, often using T-shirt sizes (XS, S, M, L, XL), and it’s more reliable and accurate because it embraces the inherent uncertainty of building something new.

To understand the difference, let's think about ordering dinner.

The Two Restaurant Approaches

Imagine you walk into a restaurant. You have two ways to order:

The Traditional Calculation Method:

You ignore the menu. You walk up to the chef and say, "I want a dish with 150g of chicken breast, 80g of basmati rice, a sauce made with coconut milk, lemongrass, and ginger, plus a side of steamed broccoli. How much will that cost and exactly when will it be ready?" The chef has to stop everything, calculate the cost of each raw ingredient, estimate the prep time for your unique request, and factor in how busy the kitchen is. The price and time you get will be a highly specific, but fragile, guess.

The Relative Sizing Method:

You look at the menu and order the Green Curry. The price is fixed. You know roughly what you're getting. The restaurant has made this dish hundreds of times. They know the average cost and time it takes to prepare, even with slight variations in chicken breast size or the intensity of a chili. You get a predictable outcome for a predictable cost.

Traditional software estimation is the first method. We ask teams for a precise number based on a list of ingredients, many of which we haven't even sourced yet. Relative T-shirt sizing is like ordering from the menu.

How T-Shirt Sizing Delivers More Accuracy

When we use traditional estimation, we're asking for the impossible: a precise prediction of an unknown future. We force developers to commit to a specific number of hours for a task they haven't started yet. This leads to several problems:

False Precision: An estimate of 42 hours sounds more accurate than "this is a Medium-sized task," but it's not. It ignores unknowns, potential roadblocks, and the natural ebb and flow of creative work.
Defensive Padding: Because developers are held to these precise numbers, they often add significant padding to protect themselves, which distorts the timeline.
Wasted Time: Teams can spend more time debating whether a task is 12 hours or 16 hours than it would take to simply do the work.

Relative sizing fixes this by changing the question. We no longer ask, "How long will this take?" Instead, we ask, "How big is this compared to other things we've done?"

A team will have a shared understanding of what a Small, Medium, or Large task feels like based on their collective experience. A Small might be a simple bug fix, while a Large could be implementing a new feature with multiple components.

The process is about comparison, not calculation. The conversation becomes about complexity, risk, and uncertainty: the real factors that affect timelines. A developer might say, "This feels like a Large to me because we have to integrate with that old system, which caused problems last time." This is a far more valuable discussion than arguing over a few hours.

Reliability Through Velocity

The true power of this approach emerges over time. A team might complete one Large, two Mediums, and three Small tasks in a two-week period. By assigning points to these sizes (e.g., S=2, M=5, L=8), they can calculate their velocity: the total points completed in that period, here 8 + 10 + 6 = 24.
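The velocity arithmetic can be sketched in a few lines of Python. This is only an illustration: the point values are the example values from the paragraph above, the function name sprint_points is hypothetical, and XS and XL are left out because this guide does not assign them point values.

```python
from collections import Counter

# Example point values taken from the paragraph above (S=2, M=5, L=8).
# XS and XL have no values in this guide, so they are omitted rather than guessed.
POINTS = {"S": 2, "M": 5, "L": 8}


def sprint_points(completed_sizes):
    """Sum the points for every item finished in one sprint (hypothetical helper)."""
    counts = Counter(completed_sizes)
    return sum(POINTS[size] * n for size, n in counts.items())


# One Large, two Mediums, and three Smalls completed in a two-week period.
completed = ["L", "M", "M", "S", "S", "S"]
velocity = sprint_points(completed)
print(velocity)  # 8 + 2*5 + 3*2 = 24
```

A team would substitute its own point scale; the only thing that matters is that the same scale is used from one period to the next.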

After a few cycles, this velocity becomes a remarkably reliable predictor of future work. If the team consistently completes around 25 points' worth of work every two weeks, you can confidently forecast that a backlog of 100 points will take about four cycles, or roughly eight weeks. You're no longer relying on a fragile, hour-by-hour project plan, but on demonstrated, historical performance.
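A minimal sketch of that forecast, assuming velocity is taken as the simple average of past two-week periods. The function name forecast_sprints and the history values are hypothetical, chosen only so that the average matches the 25 points used in the example above.

```python
import math
from statistics import mean


def forecast_sprints(backlog_points, past_velocities):
    """Estimate how many two-week periods a backlog needs, from average past velocity."""
    if not past_velocities:
        raise ValueError("need at least one completed period")
    average = mean(past_velocities)
    if average <= 0:
        raise ValueError("average velocity must be positive")
    # Round up: a partly used period still has to be scheduled.
    return math.ceil(backlog_points / average)


# Hypothetical history that averages 25 points per two-week period.
history = [24, 26, 25]
sprints = forecast_sprints(100, history)
weeks = sprints * 2
print(sprints, weeks)  # 4 8
```

Averaging is the simplest choice. A team whose velocity swings widely might prefer to forecast a range from its slowest and fastest recent periods instead of a single number.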

It’s time to stop asking our teams to be fortune-tellers and instead empower them to be experienced chefs. Give them a menu of work sized by complexity, let them establish a rhythm, and they will deliver far more predictable, reliable results.

This is only a guide to get started. Once the team is running, its velocity will inform capacity.

Note: Stories are taken into a sprint, while features are committed at quarterly planning.

Further reading: