Risk · QSRA · 9 Jun 2026 · 11 min read

Building a risk-loaded schedule: three-point estimates that mean something

A Monte Carlo engine will produce a beautiful S-curve from any inputs whatsoever. Whether the P80 it prints deserves to be believed depends entirely on work done before the first iteration runs: the state of the network, the honesty of the ranges, and a handful of modelling decisions most workshops skip. This is the practical companion to our explainer on what the simulation actually does.

Step 1: fix the network before you range anything

The simulation recalculates your network exactly as built, several thousand times. Every structural defect is therefore repeated several thousand times, and the defects are not neutral — they are systematically flattering.

An activity with no successor can absorb any sampled overrun without moving the completion date: its risk simply vanishes from the analysis. An activity with no predecessor starts on the data date in every iteration, however late the work feeding it would really run. And constraints are worse, because they don't just leak risk — they clamp the distribution. A Must-Finish-On milestone caps every iteration at the pinned date; the S-curve rises and then goes obediently vertical, and the analysis reports a near-certain finish that is an artefact of the constraint, not a property of the project.

So the first step of a credible QSRA contains no probability at all: run the DCMA 14-point check, close the open ends, justify or delete the constraints, and confirm the network actually transmits delay end to end. If the deterministic float values look wrong — and float is where broken logic shows up first — the simulation will be wrong in the same places, with more decimal places.
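The open-end and constraint part of that screen is easy to automate. The sketch below is a minimal illustration, not the full DCMA 14-point check: the `Activity` record, its field names and the constraint labels are hypothetical stand-ins rather than a P6 or MS Project schema.

```python
from dataclasses import dataclass, field

# Hypothetical activity record; field names are illustrative only.
@dataclass
class Activity:
    id: str
    predecessors: list = field(default_factory=list)
    constraint: str = ""  # e.g. "MUST_FINISH_ON"; empty string means none

# Assumed labels for constraints that pin a date in every iteration.
HARD_CONSTRAINTS = {"MUST_FINISH_ON", "MUST_START_ON"}

def screen_network(activities, start_id, finish_id):
    """List open ends and hard constraints that would distort a simulation."""
    has_successor = {p for a in activities for p in a.predecessors}
    return {
        # Starts on the data date in every iteration.
        "no_predecessor": [a.id for a in activities
                           if not a.predecessors and a.id != start_id],
        # Absorbs any overrun without moving completion.
        "no_successor": [a.id for a in activities
                         if a.id not in has_successor and a.id != finish_id],
        # Clamps the distribution at the pinned date.
        "hard_constraint": [a.id for a in activities
                            if a.constraint in HARD_CONSTRAINTS],
    }

network = [
    Activity("START"),
    Activity("A", ["START"]),
    Activity("B"),                                   # dangling start
    Activity("C", ["A"]),                            # dangling finish
    Activity("FINISH", ["A", "B"], "MUST_FINISH_ON"),
]
print(screen_network(network, "START", "FINISH"))
```

Anything this returns needs to be closed, justified or deleted before a single range is elicited.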

The vertical S-curve. If your cumulative curve hits a wall and climbs vertically at one date, you have almost certainly simulated a constraint, not a project. We see this regularly in submitted risk analyses: a P90 that equals the contract date to the day, courtesy of a Must-Finish-On nobody declared. It looks reassuring and means nothing.
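One rough way to spot the wall in your own output is to ask what share of iterations land on the single latest finish day. The sketch below assumes finishes are expressed as whole working days from the data date; the 20% threshold is an arbitrary illustrative choice, not an industry rule.

```python
from collections import Counter

def wall_share(finish_days):
    """Return the latest finish day and the share of iterations landing on it."""
    counts = Counter(finish_days)
    latest = max(counts)
    return latest, counts[latest] / len(finish_days)

def looks_clamped(finish_days, threshold=0.20):
    """Flag a run where a large share of iterations hits the same latest day.
    The threshold is an assumption chosen for illustration."""
    _, share = wall_share(finish_days)
    return share >= threshold

# Hypothetical finishes (working days from the data date), not real project data.
unconstrained = [100, 104, 109, 113, 118, 121, 127, 133, 140, 152]
clamped = [min(day, 120) for day in unconstrained]  # what a Must-Finish-On at day 120 does

print(wall_share(unconstrained), looks_clamped(unconstrained))
print(wall_share(clamped), looks_clamped(clamped))
```

In an unconstrained run the latest day is a thin tail; in a clamped run it holds a large block of the iterations.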

Step 2: three-point estimates done honestly

The three-point estimate — minimum, most likely, maximum — is where most QSRAs are quietly won or lost. The mechanics are trivial; the honesty is not.

Where ranges should come from

Reference-class data beats workshop intuition, every time you can get it. If your organisation has actuals from previous projects — how long piling packages of this size really took, across ten jobs — the spread of those actuals is your range, and no further debate is required. The workshop's job is then to argue for adjustments ("this site has better ground conditions"), not to invent numbers from a standing start. Most teams have more historical data than they think; they just store it in the heads of people who are about to retire.
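As a minimal sketch of turning actuals into a starting range: take the ratio of actual to planned duration on each past package and scale the new estimate by the smallest, middle and largest ratio. The history below is invented for illustration, and using the median as a stand-in for the most likely value is an assumption, not a rule.

```python
import statistics

def three_point_from_actuals(history, planned_days):
    """history: list of (planned, actual) durations from comparable past packages.
    Returns a starting three-point range for a new package of planned_days."""
    ratios = sorted(actual / planned for planned, actual in history)
    return {
        "min": planned_days * ratios[0],
        # Assumption: the median ratio stands in for the most likely outcome.
        "most_likely": planned_days * statistics.median(ratios),
        "max": planned_days * ratios[-1],
    }

# Hypothetical piling packages from ten past jobs: (planned days, actual days).
piling_history = [
    (20, 19), (30, 33), (25, 30), (40, 44), (20, 28),
    (35, 35), (30, 42), (25, 26), (40, 50), (20, 22),
]
print(three_point_from_actuals(piling_history, planned_days=30))
```

The workshop then argues for adjustments to these three numbers, with the basis of each adjustment recorded.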

The too-narrow-range disease

Where workshop judgement is unavoidable, it arrives with a well-documented defect: people anchor on the deterministic duration and offer a polite ribbon around it. Ask an engineer for a range and you will get ±10%, almost regardless of the activity. Compare claimed ranges against realised outcomes on almost any portfolio and the truth is closer to −5%/+50%: the downside is short because work rarely finishes much early (and when it can, it isn't reported), while the upside runs long because there are a hundred ways for an activity to go wrong and three ways for it to go right.

[Figure: The too-narrow-range disease (indicative). Range claimed in workshop (±10%) against where the actuals landed. Axis: deviation of realised duration from the single-point estimate, −20% to +80%. Twelve of nineteen outcomes fall outside the claimed range, all but one on the long side.]
Fig 1. The claimed ±10% range versus where outcomes actually land. Symmetric, narrow ranges are an anchoring artefact, not a property of construction work. Synthetic data, indicative.
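The same comparison can be run on your own close-out data. This sketch counts how many realised deviations fall below, inside and above a claimed range; the deviations listed are hypothetical placeholders, not measurements.

```python
def coverage(deviations_pct, low=-10.0, high=10.0):
    """Count realised deviations (% of the single-point estimate)
    falling below, inside and above a claimed range."""
    below = sum(d < low for d in deviations_pct)
    above = sum(d > high for d in deviations_pct)
    inside = len(deviations_pct) - below - above
    return {"below": below, "inside": inside, "above": above}

# Hypothetical deviations of actual from estimated duration, in percent.
realised = [-4, 2, 8, 12, 25, 41, -12, 6, 18, 55]
print(coverage(realised))
```

If the "above" count dominates, the workshop ranges were anchored, and the maximums need widening before anything else.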

Three consequences for how you set the points:

Triangular or PERT? Mostly: it doesn't matter

The triangular distribution takes your three points literally and linearly; BetaPERT fits a smooth curve that concentrates weight near the most likely and thins the tails, so the same three points produce a slightly tighter spread and a mean pulled less towards the extremes. People argue about the choice with surprising energy. They shouldn't: the difference between triangular and PERT on the same three points is dwarfed by the difference between an honest range and an anchored one. Pick PERT if your maximums are genuinely rare worst cases, triangular if you want the extremes to carry real weight, and spend the saved meeting time on the ranges themselves.

[Figure: Same three points, two shapes (min 10d, most likely 14d, max 28d): triangular versus BetaPERT. PERT concentrates weight near the most likely and thins the tails; triangular gives the extremes full weight. Indicative.]
Fig 2. Triangular versus BetaPERT over an identical three-point estimate. The shapes differ; the conclusions rarely do. The range is where the analysis is won.
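To put numbers on the two shapes, the sketch below computes the mean and standard deviation of each for the same three points, and shows how each could be sampled with the Python standard library. It assumes the conventional BetaPERT weight of 4 on the most likely value; tools that use a different weight will give different figures.

```python
import math
import random

def triangular_stats(lo, mode, hi):
    """Mean and standard deviation of a triangular distribution."""
    mean = (lo + mode + hi) / 3
    var = (lo * lo + mode * mode + hi * hi - lo * mode - lo * hi - mode * hi) / 18
    return mean, math.sqrt(var)

def pert_shape(lo, mode, hi, lam=4.0):
    """Beta shape parameters for BetaPERT; lam=4 is the conventional weight (assumed)."""
    alpha = 1 + lam * (mode - lo) / (hi - lo)
    beta = 1 + lam * (hi - mode) / (hi - lo)
    return alpha, beta

def pert_stats(lo, mode, hi, lam=4.0):
    """Mean and standard deviation of the scaled beta distribution."""
    alpha, beta = pert_shape(lo, mode, hi, lam)
    total = alpha + beta
    mean = lo + (hi - lo) * alpha / total
    var = (hi - lo) ** 2 * alpha * beta / (total ** 2 * (total + 1))
    return mean, math.sqrt(var)

def sample_triangular(rng, lo, mode, hi):
    # Note the stdlib argument order: low, high, mode.
    return rng.triangular(lo, hi, mode)

def sample_pert(rng, lo, mode, hi, lam=4.0):
    alpha, beta = pert_shape(lo, mode, hi, lam)
    return lo + (hi - lo) * rng.betavariate(alpha, beta)

tri_mean, tri_sd = triangular_stats(10, 14, 28)
pert_mean, pert_sd = pert_stats(10, 14, 28)
print(f"triangular: mean {tri_mean:.2f}, sd {tri_sd:.2f}")
print(f"BetaPERT:   mean {pert_mean:.2f}, sd {pert_sd:.2f}")
```

For the 10/14/28 example this gives a triangular mean of about 17.3 days against about 15.7 for PERT, with standard deviations of about 3.9 and 3.2. Widening an anchored maximum from 28 to 40 days moves either figure by more than the choice of shape does.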

Step 3: don't range everything

On a 4,000-activity network, eliciting 4,000 bespoke three-point estimates is neither possible nor useful. The standard answer is banding: group activities by risk class — trade, design maturity, procurement route, weather exposure — and assign each band a percentage range (say, −10%/+35% for groundworks, −5%/+15% for proven M&E installation). Reserve bespoke estimates for the twenty or thirty activities that the deterministic critical path, the float profile and your own judgement nominate as the ones that matter. A QSRA with eight honest bands and thirty considered ranges beats one with four thousand copy-pasted ±10%s, and takes a tenth of the time.
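Banding with bespoke overrides is a simple lookup. In the sketch below the two band percentages echo the examples above; the activity records, band keys and the bespoke entry are hypothetical.

```python
# Percentage ranges per risk class, as (low, high) fractions of the duration.
BANDS = {
    "groundworks": (-0.10, 0.35),
    "m&e_proven": (-0.05, 0.15),
}

def three_point(activity, bands, bespoke):
    """Return (min, most_likely, max) in days for one activity.
    A bespoke estimate, where one exists, overrides the band."""
    if activity["id"] in bespoke:
        return bespoke[activity["id"]]
    low_pct, high_pct = bands[activity["band"]]
    duration = activity["duration"]
    return (duration * (1 + low_pct), duration, duration * (1 + high_pct))

# Hypothetical activities and one considered, bespoke range.
activities = [
    {"id": "A100", "band": "groundworks", "duration": 20},
    {"id": "A200", "band": "m&e_proven", "duration": 40},
    {"id": "A300", "band": "groundworks", "duration": 15},
]
bespoke = {"A300": (14, 15, 32)}

for act in activities:
    print(act["id"], three_point(act, BANDS, bespoke))
```

Keeping the bespoke ranges in their own table also makes it obvious which thirty estimates were actually argued over.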

Step 4: correlation, the honesty multiplier

Here is the quiet scandal of cheap risk analysis: sampling every activity independently makes the output narrower than the truth. If each duration is drawn separately, one activity's overrun is forever being cancelled by another's early finish, and the extremes average away. But real overruns travel in packs — they share causes. A wet winter hits every earthworks activity at once. A weak subcontractor is slow on all of their packages. Immature design bleeds into every fabrication duration downstream. When the bad versions happen together, the project's bad tail is much worse than independence predicts.

Modelling this properly is a research topic; modelling it adequately is an afternoon. Define correlation groups for the obvious common causes — weather-exposed work, each major subcontractor, design-dependent packages — and set a moderate positive correlation within each group. The S-curve widens at both ends, the P80 moves right, and the analysis stops pretending your risks have agreed to take turns.

[Figure: What correlation does to the completion distribution (indicative). Independent sampling (falsely narrow) versus correlation groups (honest), with the P80 of each marked; axis runs earlier to later. Same network, same ranges; only the correlation assumption differs. Independence lets overruns cancel each other out.]
Fig 3. Ignoring correlation doesn't make the analysis neutral — it makes it optimistic in a specific, predictable direction. The widened curve is the honest one. Synthetic data, indicative.
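The effect is easy to reproduce on a toy case. The sketch below is one possible way to induce correlation, not the method of any particular tool: a one-factor Gaussian copula, in which each activity in a group shares a common random driver. The twelve-activity serial chain, its ranges and the 0.5 correlation are all hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def triangular_ppf(u, lo, mode, hi):
    """Inverse CDF of the triangular distribution."""
    cut = (mode - lo) / (hi - lo)
    if u < cut:
        return lo + math.sqrt(u * (hi - lo) * (mode - lo))
    return hi - math.sqrt((1 - u) * (hi - lo) * (hi - mode))

def sample_group(rng, ranges, rho):
    """One iteration of durations for a group sharing a common cause.
    rho is the pairwise correlation of the underlying normal drivers."""
    common = rng.gauss(0, 1)
    durations = []
    for lo, mode, hi in ranges:
        z = math.sqrt(rho) * common + math.sqrt(1 - rho) * rng.gauss(0, 1)
        durations.append(triangular_ppf(_STD_NORMAL.cdf(z), lo, mode, hi))
    return durations

def chain_totals(ranges, rho, iterations=5000, seed=1):
    """Total duration of a serial chain, one value per iteration."""
    rng = random.Random(seed)
    return [sum(sample_group(rng, ranges, rho)) for _ in range(iterations)]

def p_level(values, p):
    """Simple empirical percentile (no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

# Hypothetical: twelve weather-exposed activities in series, each 9/10/14 days.
earthworks = [(9, 10, 14)] * 12
independent = chain_totals(earthworks, rho=0.0)
correlated = chain_totals(earthworks, rho=0.5)

for name, totals in (("independent", independent), ("correlated", correlated)):
    print(name, "P10", round(p_level(totals, 0.1), 1), "P80", round(p_level(totals, 0.8), 1))
```

With the same ranges, the correlated run should show a lower P10 and a higher P80 than the independent one. A real network needs this applied per correlation group inside the scheduling pass rather than to a simple sum.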

Step 5: risk events are not duration uncertainty

Ranging durations captures background uncertainty — the ordinary variability of work you are definitely doing. It does not capture the things that might or might not happen at all: the planning judicial review, the main bearing failure on test, the supplier insolvency. Folding those into wider duration ranges smears a discrete event into a permanent fog and gets both wrong.

Model them instead as risk events: each one carries a probability of occurring and an impact range if it does, attached to the network as a fragnet-style insertion that only exists in the iterations where the dice say it fired. The simulation then handles both layers at once — every iteration samples the background ranges, and some iterations additionally suffer the events. This risk-driver approach also keeps the risk register and the schedule risk analysis honest with each other, which is rarer than it should be: if a register risk can't be expressed as probability, impact and a point of attack in the network, it is a worry, not a risk.
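A minimal sketch of the two layers, assuming a simple serial chain in which a fired event adds its impact to the total (a real model would insert it at a specific point in the network). The ranges, the 15% probability and the impact figures are hypothetical.

```python
import random

def simulate_finish(rng, background, events):
    """One iteration: background ranges are always sampled, events only if they fire.
    background: list of (min, most_likely, max) durations in days.
    events: list of dicts with 'probability' and 'impact' (min, most_likely, max)."""
    total = sum(rng.triangular(lo, hi, ml) for lo, ml, hi in background)
    for event in events:
        if rng.random() < event["probability"]:   # the event exists only when it fires
            lo, ml, hi = event["impact"]
            total += rng.triangular(lo, hi, ml)
    return total

rng = random.Random(42)
background = [(18, 20, 27), (27, 30, 45), (9, 10, 12)]
events = [
    {"name": "supplier insolvency", "probability": 0.15, "impact": (20, 30, 60)},
]
finishes = sorted(simulate_finish(rng, background, events) for _ in range(5000))
print("P50:", round(finishes[2500], 1), "P80:", round(finishes[4000], 1))
```

Because the event is absent from most iterations, it shows up as a distinct second hump or a long tail in the results rather than as a uniformly wider range.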

Step 6: a QSRA is a forecast, not a gate deliverable

The least respected step. A risk analysis run once, at sanction, and filed alongside the business case is a photograph of what the team believed before reality started voting. Ranges should tighten as design matures and work completes; risk events should fire, retire or escalate; the completion distribution should narrow update by update, and if it doesn't, that is a finding in itself. Treat the QSRA as part of the update cycle — re-run it when the schedule is statused, and trend the P-levels the same way you trend everything else against a properly managed baseline. A P80 that drifts three weeks to the right across two updates is the earliest, cheapest delay warning you will ever get.
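Trending the P-levels needs nothing more than the simulated finishes from each update. In this sketch the update labels and finish values are synthetic placeholders, and finishes are assumed to be working days from a fixed reference date so that runs are comparable.

```python
def p_level(values, p):
    """Simple empirical percentile (no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

def trend(runs, p=0.8):
    """runs: list of (update_label, simulated_finish_days) in update order.
    Returns (label, P-level, drift since previous update) for each run."""
    rows, previous = [], None
    for label, finishes in runs:
        level = p_level(finishes, p)
        drift = None if previous is None else level - previous
        rows.append((label, level, drift))
        previous = level
    return rows

# Synthetic runs: each later update is shifted to the right.
runs = [
    ("Update 1", list(range(100, 200))),
    ("Update 2", list(range(107, 207))),
    ("Update 3", list(range(115, 215))),
]
for label, level, drift in trend(runs):
    print(label, "P80 =", level, "drift =", drift)
```

A positive drift on consecutive updates is the warning described above; the same function with `p=0.5` gives the P50 trend.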

Running the workshop without harvesting nonsense

Finally, the elicitation craft itself — because most bad inputs are manufactured in a single optimistic afternoon:

Input | Good practice | Smell test
Network | 14-point check passed; constraints justified; delay propagates end to end | S-curve goes vertical at one date: you've simulated a constraint
Ranges | Reference-class actuals first; asymmetric by default; basis recorded | Every range is ±10%: anchoring, not estimating
Coverage | Risk-class bands plus bespoke ranges on the activities that matter | 4,000 identical copy-pasted ranges
Correlation | Groups for weather, subcontractors, design maturity | None applied: distribution suspiciously narrow
Risk events | Probability × impact, fragnet-style, mapped from the live register | Register risks "covered" by fatter duration ranges
Cadence | Re-run each update; trend P50/P80 over time | Last run dated the same month as the gate review
A workable minimum. If the full programme above is out of reach, do three things: fix the network, widen the maximums until they cover the worst comparable outcome you can actually name, and add correlation groups for weather and your two biggest subcontractors. That alone moves you out of decorative-S-curve territory.

Key takeaways

Range it, run it, read it — in your browser

ScheduleInsight's Monte Carlo QSRA runs on your P6 XER or MS Project file entirely in the browser: three-point ranges, S-curve, P-levels and tornado, nothing uploaded. The tutorials walk through a full analysis.
