Planning Under Uncertainty

Consider two bike journeys: your daily commute to work versus a cycling adventure through Patagonia. For your commute, you can estimate arrival time within minutes based on experience. Planning a multi-week expedition through unpredictable terrain requires a completely different approach: flexibility, contingency, and the ability to adapt as conditions change.

Enterprise planning works the same way. Simple or complicated tasks are like the daily commute: predictable, easy to estimate. Complex initiatives are the Patagonia adventure: they demand adaptive planning that accepts uncertainty rather than fighting it. In AME3, planning happens at every level. A Team planning its next Match. An Arena Owner ordering the Arena Backlog. Accountable Representatives estimating strategic Goals in the Tournament. The principles in this chapter apply to all of them.

Planning in the Simple and Complicated Space

Simple and complicated tasks are predictable, repeatable, and easily estimated. Once you understand them, they become a no-brainer.

Most enterprises handle simple and complicated work reasonably well. The real challenge lies elsewhere: as markets evolve faster and uncertainty grows, the complex space continuously expands. This is where traditional estimation, prediction, and planning methods break down.

Planning in the Complex Space

The Patagonia adventure represents this complex space. Every day brings new terrain, unexpected weather, and unforeseen challenges. This is where simple, time-based planning fails and adaptive approaches become essential.

Scrum is designed for the complex space where time-based planning breaks down

The challenges of planning in uncertain environments are not new. The Agile community has developed specific practices for planning and estimating in complex product development: iterative planning, relative estimation, and velocity-based forecasting. These techniques embrace uncertainty rather than pretend it does not exist.

“No plan survives first contact with the enemy.” — Helmuth von Moltke (paraphrased)

But long before the Agile Manifesto, military strategists understood the fundamental problem. Helmuth von Moltke, a Prussian field marshal, captured this reality in 1871 when he stated: “No plan of operations extends with any certainty beyond the first encounter with the main enemy forces.”

In business, your “enemy” might be competitors in the market, emerging technologies like AI, or simply the inherent complexity of product development. The Agile Manifesto addresses this directly: “responding to change over following a plan.”

This does not mean plans have no value. It means we must prioritize adaptability over rigid adherence to predetermined paths.

The more complex the task, the greater the deviation from the plan and the lower the accuracy of estimates. This is not a failure of planning. It is a characteristic of complex work.

The Accuracy Illusion

Humans are optimists. It is part of being human. But optimism creates a cognitive bias in estimation.

Moreover, the relationship between effort spent sizing work and forecast accuracy is not linear. It follows a saturation curve. After a certain point, spending more time does not improve accuracy. In fact, it can decrease accuracy as people influence each other.

In short, when someone asks “how accurate is your forecast?” most people answer with an overly optimistic number. The Overconfidence Bias makes us systematically overestimate the accuracy of our own judgments. In reality, actual accuracy is far lower than the number given.

The practical conclusion: invest less time in sizing.

The probability that you are over-investing in detailed projections and not gaining better accuracy is high. When choosing between detailed techniques or simpler approaches, err on the side of simplicity.

Planning Is Often an Excuse Not to Start

Humans naturally avoid risk. This makes evolutionary sense: why take unnecessary risks? In complex environments where uncertainty is high, this tendency becomes stronger.

Today’s risk in business is typically a difficult conversation with your manager, not physical danger. Yet we still avoid starting.

Several perspectives help overcome this paralysis:

Planning and estimation must be paid for. The time spent on detailed plans and estimates consumes budget. Often, direct implementation would provide more certainty faster.

Only implementation provides absolute certainty. Each implementation validates or invalidates a hypothesis.

Experiments can reduce risk before full commitment. When the cost and risk of direct implementation feel too high, a targeted experiment can provide just enough evidence to decide whether to proceed. The key question is always: is the experiment cheaper than the risk it eliminates?
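The “is the experiment cheaper than the risk it eliminates?” question can be made concrete as a rough expected-value check. This is an illustrative sketch, not part of the AME3 framework, and the numbers are hypothetical.

```python
def experiment_worthwhile(experiment_cost, failure_probability, failure_cost):
    """An experiment pays off when it costs less than the expected loss
    it can prevent (failure probability times cost of a failed full build)."""
    expected_loss = failure_probability * failure_cost
    return experiment_cost < expected_loss

# Hypothetical numbers: a beta costing 40k vs. a 30% chance of
# wasting a 500k full build.
assert experiment_worthwhile(40_000, 0.3, 500_000)        # 40k < 150k: run it
assert not experiment_worthwhile(200_000, 0.3, 500_000)   # experiment too expensive
```

The probabilities are themselves guesses, of course, but even rough numbers expose experiments that clearly cost more than the risk they retire.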

The Anticipate–Advance–Assess loop for running experiments to validate a hypothesis

The following example shows how a SaaS product team might structure an experiment before committing to a full conversation-oriented feature build.

| Phase | Step | Example |
| --- | --- | --- |
| Anticipate | Hypothesis | B2B project management users in software teams spend significant time navigating menus and filling forms. They would complete more work per session if they could manage their backlog, status updates, and queries through natural language. |
| Anticipate | How to validate? | Run an 8-week opt-in beta for 5% of existing users. Release a conversational side panel, no new backend logic, only a natural language layer over existing APIs. |
| Advance | Run Experiment | Recruit the highest-activity user segment. Users create, update, assign, and query items using natural language for the duration of the beta. |
| Advance | Measure | Weekly active usage rate of the conversational interface. Task completion time for three benchmark workflows: create task with assignee and due date, move item to done, query “what is overdue”. |
| Assess | Conclusion? | Proceed if at least 25% of beta users complete one task per week via conversation by week 4 and benchmark time drops by 30%. Stop if users revert to the traditional UI after the first session or report that AI responses require more correction than the form-based flow. |

Note that this experiment still carries cost. If the team is confident enough in user demand, shipping a minimal version directly might provide more certainty faster, at lower total cost than eight weeks of instrumented beta.

Scrum and AME3 are fundamentally empirical control processes. Arena Backlog entries are hypotheses. Advancing in a Match or Sprint generates data that validates or invalidates them.

Different Methods for Each Level

Do not try to create a single sizing hierarchy from enterprise goals down to individual tasks. This classical work-breakdown-structure thinking works in simple environments but fails in complex ones, where change happens too fast.

Instead, use lightweight methods at each organizational level that provide sufficient accuracy with minimal effort:

Enterprise level - Strategic goals and initiatives

Spider diagram comparing a new project against a reference project across five dimensions: Relevance, Benefit, Impact, Complexity, and CAPEX. Pseudo-Fibonacci scale 1–20.
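The spider-diagram comparison can be sketched in a few lines: each dimension is scored on the pseudo-Fibonacci scale relative to a known reference project. The dimension scores below are hypothetical, chosen only to show the mechanic.

```python
# Pseudo-Fibonacci scale used for all dimension scores
FIB_SCALE = (1, 2, 3, 5, 8, 13, 20)

# Hypothetical scores: a well-understood past project as the reference,
# and a new strategic initiative scored relative to it.
reference   = {"Relevance": 8,  "Benefit": 8, "Impact": 5, "Complexity": 8,  "CAPEX": 13}
new_project = {"Relevance": 13, "Benefit": 8, "Impact": 8, "Complexity": 13, "CAPEX": 8}

# Every score must come from the agreed scale
assert all(v in FIB_SCALE for v in {**reference, **new_project}.values())

# Show where the new initiative is larger or smaller than the reference
for dim in reference:
    delta = new_project[dim] - reference[dim]
    print(f"{dim}: {'+' if delta > 0 else ''}{delta}")
```

The value is not the numbers themselves but the conversation each delta triggers: why do we believe this initiative is more complex than the one we already delivered?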

Product/Backlog level - Features and stories

  • Relative sizing (Story Points with planning poker or bucket sizing)
  • Velocity-based forecasting
  • No sizing (counting items)

Task level - Daily activities

  • Time-based projections using natural time boxes (days)
  • Pull-based planning

Each level has its own reference frame and planning horizon. Do not try to roll up task projections into story points, and do not try to size enterprise initiatives by detailing them in backlog or project plans.

Choose the Right Reference

Relative sizing works because humans find it easier to compare things than to measure them absolutely.

Time is actually a poor reference for many situations:

  • How long will a colleague take?
  • How long will I take tomorrow when I am more experienced?
  • It is like the prototype metre bar in Paris changing its length all the time. A precise measurement requires a reference that is even more stable than the thing you measure.

Better references include:

Other work items - Planning Poker and bucket sizing compare stories to each other, not to abstract time measures.

Metaphors - Using animals (elephant, cat, mouse) creates mental images. The brain processes relative sizes more naturally than abstract numbers.

Completed work - What we have already delivered provides the most reliable reference frame.

The key is that your reference stays stable even as your team’s capability evolves. A story sized as 5 remains larger than a story sized as 1, regardless of how fast the team delivers either.

Principles for Working with Forecasts

Forecast accuracy converges over time. The range of optimistic and pessimistic projections narrows as the team gains real data from completed Matches.

Several principles guide effective forecasting in complex environments:

  • Never use forecasts to measure performance. What you measure is what you get. If you measure team performance using velocity, teams will inflate their numbers. Forecasts should only inform planning, never drive evaluation, rewards, or punishments.
  • Prefer relative sizing over absolute time. Relative sizing with abstract numbers (like Story Points) keeps projections stable longer. The five-point item remains five times larger than the one-point item, even as the team speeds up. Only the velocity changes, not the sizing. This makes planning dramatically more stable.
  • Use pull-based planning as your built-in forecast. Scrum has an underappreciated technique: the Sprint is your jar, backlog items are your peas. Pull items until full. Calculate velocity by counting completed items. If items are refined to fit within one Match, they naturally become similar in size. This is simpler to learn than Planning Poker, and in practice, accuracy is comparable.
  • Fixed iterations make this work. Without Sprints or Matches, pull-based planning does not function. The fixed timebox is the reference that makes implicit forecasting possible.
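The jar-and-peas mechanic above can be sketched in a few lines. The numbers are hypothetical; the point is that velocity is a plain count of completed items, and the forecast is ordinary ceiling division.

```python
import math

# Pull-based planning sketch: the Sprint is the jar, similarly sized
# backlog items are the peas. Velocity is a count of completed items.
completed_per_sprint = [7, 9, 8, 8, 9]   # items finished in past Sprints (hypothetical)
velocity = sum(completed_per_sprint) / len(completed_per_sprint)

backlog_remaining = 42                    # refined items still open
sprints_needed = math.ceil(backlog_remaining / velocity)

print(f"velocity ~= {velocity:.1f} items/Sprint, "
      f"forecast: {sprints_needed} Sprints for {backlog_remaining} items")
```

No Story Points, no Planning Poker: refinement to a common size plus a fixed timebox does all the work.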

Never Plan Absolute Buffers

One of the most common mistakes in planning is using absolute buffers instead of relative ones. The difference is critical.

A one-sprint buffer in a ten-sprint plan is only 10% contingency. A proper 33% buffer requires proportionally more runway.

Consider a burn-down chart for a team with an average velocity of 10: in 10 sprints it can complete 100 items. If you add one sprint of buffer to a three-sprint timeline, that is roughly 33% contingency, already optimistic for most projects.

But here is the problem: if you are planning 10 sprints ahead and still use just one sprint as a buffer, you are down to 10% contingency. For 20 sprints, it becomes 5%.

Buffers must scale proportionally with timeline length.

For a 33% buffer, you need:

  • 4 sprints of buffer for 10 sprints of work (count only full sprints, so 3.3 becomes 4 sprints)
  • 7 sprints of buffer for 20 sprints of work

Remember: that is only 33% contingency, which is actually optimistic for most enterprise projects.
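The proportional-buffer rule reduces to one line of arithmetic: scale the buffer with the plan length, then round up because you can only plan in whole iterations.

```python
import math

def buffer_sprints(planned_sprints, contingency=0.33):
    """Relative buffer: scale with plan length, round up to full Sprints."""
    return math.ceil(planned_sprints * contingency)

assert buffer_sprints(3) == 1    # 0.99 -> 1 Sprint, roughly the 33% example
assert buffer_sprints(10) == 4   # 3.3  -> 4 Sprints
assert buffer_sprints(20) == 7   # 6.6  -> 7 Sprints
```

An absolute buffer ignores the `planned_sprints` term entirely, which is exactly why it shrinks to 10% and then 5% as the plan grows.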

The Value Is in the Exchange

“Plans are worthless, but planning is everything.” — Dwight D. Eisenhower

The longer version adds context: “In preparing for battle, I have always found that plans are useless, but planning is indispensable.”

The planning process generates insights. The plan itself is just a snapshot.

Planning forces engagement with the future and its uncertainties. The conversation among participants surfaces critical misunderstandings and creates shared understanding that no written plan can capture.

If your estimation technique creates good communication containers where teams and leaders discover misalignments and build shared understanding, it is valuable regardless of the accuracy of the estimates produced.

But always ask: could we have this exchange without the estimation technique? If the technique only provides value through communication, perhaps there is a lighter way to achieve the same conversation.

Planning and estimation in complex environments require accepting uncertainty, using lightweight techniques, and focusing on learning over precision. The goal is not perfect accuracy but sufficient confidence to make good decisions. And the best source of confidence is not a better estimate. It is a shorter feedback loop.