Estimation Is Hard. Here Is Why You Keep Getting It Wrong.
by Arif Ikhsanudin, Backend Developer
The Estimate That Seemed Right
"This should take three days." You know the feature area, you've built similar things before, and three days feels right. On day three, you're about 70% done. On day five, you're dealing with an edge case that required a data migration you hadn't anticipated. On day seven, the feature ships. The estimate was wrong by 130%.
This is not unusual. Overruns of 40-200% on software time estimates are commonly reported. The distribution is heavily skewed: the optimistic scenario is rarely realized, while the pessimistic tail is always within reach. If you're consistently surprised by how long things take, the problem is not a series of individual estimation errors. You're missing structural patterns.
The Planning Fallacy
Daniel Kahneman and Amos Tversky described the planning fallacy: people systematically underestimate the time, cost, and risks of future actions, even when they know that similar past actions went over time and over budget.
The mechanism: when estimating, we construct a best-case narrative of how the work will go. We imagine the steps in sequence, roughly how long each takes, and arrive at a total. What we don't model is the variability: the edge case we'll discover mid-implementation, the dependency that isn't ready, the review cycle that takes longer, the context switch that breaks our flow for a day, the production incident that consumes two days in the middle of the sprint.
These interruptions are not exceptional. They happen consistently, in roughly predictable proportions. But because they're interruptions — not part of the planned work — they're systematically excluded from estimates.
The Specific Things You Aren't Counting
An honest inventory of what goes into software work, beyond the implementation itself:
Requirements clarification: Back-and-forth on ambiguous requirements. Usually a day or more per feature, invisible in estimates.
Technical investigation: Exploring an unfamiliar part of the codebase, understanding a dependency, researching an approach. Usually 20-30% of implementation time, often not estimated separately.
Integration and testing: Time spent making the implementation work with real dependencies, debugging integration issues, covering edge cases in tests. Often underestimated by 50%.
Code review cycles: The time from PR opened to PR merged, including the back-and-forth of review comments and fixes. Typically 1-3 days for non-trivial changes on active teams.
Deploy and verification: Deploying to staging, testing, deploying to production, monitoring for issues. Often not estimated at all.
Discovery of unexpected complexity: The thing you didn't know you didn't know. Happens on most non-trivial features. By definition, you can't estimate for specific unknowns, but you can budget for them.
A reasonable rule of thumb: double your gut estimate for the implementation, then add 30% for everything else. This gets you into the right range more often than the gut estimate alone.
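As a rough illustration, the rule of thumb can be written as a small function. This is a minimal sketch, not a calibrated model: the function name and parameters are hypothetical, and it assumes the 30% is applied on top of the doubled implementation figure, which the rule itself leaves open.

```python
def padded_estimate(gut_days: float,
                    multiplier: float = 2.0,
                    overhead: float = 0.30) -> float:
    """Apply the rule of thumb to a gut estimate, in days.

    Assumption: the overhead percentage is taken on the doubled
    implementation figure, not on the original gut number.
    """
    implementation = gut_days * multiplier      # double the gut estimate
    return implementation * (1 + overhead)      # add 30% for everything else


# A three-day gut estimate becomes roughly 7.8 days.
print(round(padded_estimate(3), 1))
```

Under that reading, the three-day gut estimate from the opening example lands close to the seven days the feature actually took.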
The Reference Class Problem
The most reliable way to estimate is to find historical data on similar past work. "We estimated this class of feature at 3 days last quarter; it took 7. We estimated a similar one at 5 days; it took 9. So 3 days of gut estimate probably means 5-7 days." Reference class forecasting, which forecasts from the distribution of similar past outcomes rather than from a constructed narrative of the current task, consistently outperforms bottom-up estimation.
This requires tracking actual outcomes, not just estimates. If your team doesn't currently record actual time-to-complete alongside estimates, start. Six months of data produces significantly better estimates than unaided intuition.
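One simple way to turn recorded outcomes into a forecast is to scale the new gut estimate by the overrun ratios seen on similar past work. The sketch below uses the two data points quoted above; the function names are hypothetical, and using the minimum, median, and maximum ratio is an assumption chosen for simplicity. With more history, percentiles would likely be a better choice.

```python
from statistics import median

# (estimated_days, actual_days) for similar past features, taken from the text.
history = [(3, 7), (5, 9)]


def overrun_ratios(history):
    """Return actual/estimated ratios, sorted from smallest to largest."""
    return sorted(actual / estimated for estimated, actual in history)


def reference_class_range(gut_days, history):
    """Scale a gut estimate by the lowest, median, and highest past ratio."""
    ratios = overrun_ratios(history)
    return (gut_days * ratios[0],
            gut_days * median(ratios),
            gut_days * ratios[-1])


low, typical, high = reference_class_range(3, history)
print(f"{low:.1f} to {high:.1f} days, typically {typical:.1f}")
```

With only these two data points, a three-day gut estimate maps to about 5.4 to 7.0 days. Two points is far too few to trust; the range becomes meaningful only as the history grows.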
Communicating Uncertainty Honestly
Estimates are not commitments. Communicating them as point estimates — "this will take three days" — implies precision that doesn't exist. Communicating them as ranges — "this will take between three and seven days, most likely around five" — is more honest and more useful for planning.
When stakeholders push back on ranges ("I need a number"), the appropriate response is to explain what drives the uncertainty. "The core implementation is three days. The uncertainty is around the integration with the third-party API: if their documentation is accurate, it's two more days; if we hit the issues we've seen with similar integrations, it's four." This gives the stakeholder a concrete choice: accept the uncertainty, or invest in reducing it by running a spike or getting answers from the API provider.
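The breakdown in that conversation can be captured as a list of work items with a best and worst case each. This is a sketch under stated assumptions: the `WorkItem` type and both helper functions are hypothetical, and summing the worst cases is a deliberately simple choice that will overstate the upper bound when there are many independent items.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    name: str
    best_days: float
    worst_days: float


def estimate_range(items):
    """Sum best cases and worst cases to get an overall range."""
    return (sum(i.best_days for i in items),
            sum(i.worst_days for i in items))


def biggest_uncertainty(items):
    """Return the item with the widest gap between best and worst case."""
    return max(items, key=lambda i: i.worst_days - i.best_days)


# Figures from the example conversation above.
items = [
    WorkItem("core implementation", 3, 3),
    WorkItem("third-party API integration", 2, 4),
]

low, high = estimate_range(items)
driver = biggest_uncertainty(items)
print(f"{low:g} to {high:g} days; largest uncertainty: {driver.name}")
```

For the figures in the example, this gives a range of 5 to 7 days and names the API integration as the item worth a spike.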
The Practical Takeaway
For your next estimate, write down your gut number. Then spend ten minutes listing the specific things that could make this take longer than expected. Adjust based on the list. Compare your final estimate to the gut number. If they're the same, you haven't looked for the risks carefully enough. Track the outcome and compare it to both numbers. Over time, you'll learn your own systematic biases and correct for them.
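A personal estimation log does not need to be elaborate. The sketch below compares gut and adjusted estimates against actual outcomes; all of the numbers are hypothetical placeholders, and the average actual-to-estimate ratio is just one possible way to measure bias.

```python
from statistics import mean


def mean_overrun(estimates, actuals):
    """Average of actual/estimate; above 1.0 means you tend to underestimate."""
    return mean(a / e for e, a in zip(estimates, actuals))


# Hypothetical log, in days, one entry per finished task.
gut = [3, 2, 5]
adjusted = [5, 3, 9]
actual = [7, 3, 9]

print(f"gut bias: {mean_overrun(gut, actual):.2f}x")
print(f"adjusted bias: {mean_overrun(adjusted, actual):.2f}x")
```

With this made-up data, the gut numbers run at about 1.88x and the adjusted ones at about 1.13x. If your own adjusted ratio stays well above 1.0, the ten-minute risk list is still missing something.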