You test your threshold on a Tuesday morning. Fresh legs, good sleep, well fuelled. The numbers look great. Two months later, you race a 70.3. The first hour goes to plan. By hour three, you are hanging on. By the back half of the run, the power and pace that felt sustainable in testing feel impossible.
The test was not wrong. You just assumed the number would hold.
What Durability Actually Is
Your threshold is not a fixed line. It shifts. Under fatigue, heat, dehydration, and substrate depletion, the physiological markers that define your thresholds drift downward over the course of prolonged exercise. Your first threshold drops. Your second threshold drops. The gap between them compresses. The metabolic state that was comfortable at minute twenty is no longer comfortable at hour three.
This is where durability comes in. Durability is your resistance to that drift: how slowly your physiological ceiling decays during sustained effort. An athlete with high durability can maintain their threshold outputs deep into a race. An athlete with low durability watches those outputs crumble as the hours pass, even when the initial numbers looked strong.
Two athletes with the same threshold power measured in a fresh test can have dramatically different thresholds at three hours. One might hold 90 percent of their tested output. The other might hold 70 percent. On race day, that difference is not subtle.
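To put numbers on that comparison, here is a minimal Python sketch. The 250-watt fresh threshold is a hypothetical figure chosen for illustration, and `power_at_fatigue` is an illustrative helper name. Only the 90 and 70 percent retention figures come from the example above.

```python
def power_at_fatigue(fresh_threshold_w, retained_fraction):
    """Threshold power left after fatigue, given the fraction of fresh output retained."""
    if not 0.0 <= retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be between 0 and 1")
    return fresh_threshold_w * retained_fraction


# Hypothetical fresh-test threshold, identical for both athletes.
FRESH_THRESHOLD_W = 250.0

durable_w = power_at_fatigue(FRESH_THRESHOLD_W, 0.90)  # holds 90 percent
fragile_w = power_at_fatigue(FRESH_THRESHOLD_W, 0.70)  # holds 70 percent
gap_w = durable_w - fragile_w

print(f"Durable athlete at three hours: {durable_w:.0f} W")
print(f"Fragile athlete at three hours: {fragile_w:.0f} W")
print(f"Gap between them: {gap_w:.0f} W")
```

With these assumed inputs, two athletes who tested identically are separated by 50 watts of sustainable power late in the race.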
Why Your Test Does Not Tell You This
Standard threshold testing measures your physiology in a fresh state. Whether it is a 20-minute effort, a ramp test, or a time trial, the data is valid. But it only captures one point on what should be a curve.
Olav Aleksander Bu, the physiologist behind Norway's Olympic triathlon programme, has made this point directly. He advocates for building what he calls a metabolic duration curve: sampling the same physiological markers at intervals throughout a long session rather than relying on a single snapshot. The threshold you measure at ten minutes is not the threshold you carry at sixty minutes. Both are real. But only the later one tells you anything about what happens at hour four of a race.
Your first threshold measured fresh is not your first threshold under fatigue. It shifts with substrate availability, core temperature, hydration status, and accumulated metabolic stress.
Think about what this means for your training zones. Your five-zone structure is built from thresholds identified in a fresh state. Zone 3 sits between your first and second threshold. Zone 2 sits below the first threshold. But as fatigue accumulates during a long race, both thresholds drop. The intensity that was comfortably Zone 2 at the start of the bike leg might be Zone 3 by the time you start the run. You have not changed your effort. Your physiology has changed underneath you.
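The zone shift can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the fresh thresholds of 200 and 260 watts, the 190-watt bike effort, and the 10 percent drop applied equally to both thresholds. Labelling everything above the second threshold "Zone 4-5" is also an assumption, since the text only places Zones 2 and 3.

```python
def zone_band(power_w, first_threshold_w, second_threshold_w):
    """Place a power output relative to the two thresholds."""
    if power_w < first_threshold_w:
        return "Zone 1-2"
    if power_w <= second_threshold_w:
        return "Zone 3"
    return "Zone 4-5"  # assumed label for work above the second threshold


def drifted(threshold_w, drop_fraction):
    """Threshold after it has dropped by the given fraction under fatigue."""
    return threshold_w * (1.0 - drop_fraction)


# Hypothetical fresh-state thresholds and a steady bike-leg effort.
FRESH_FIRST_W, FRESH_SECOND_W = 200.0, 260.0
EFFORT_W = 190.0
ASSUMED_DROP = 0.10  # illustrative drop, not a measured value

zone_fresh = zone_band(EFFORT_W, FRESH_FIRST_W, FRESH_SECOND_W)
zone_fatigued = zone_band(
    EFFORT_W,
    drifted(FRESH_FIRST_W, ASSUMED_DROP),
    drifted(FRESH_SECOND_W, ASSUMED_DROP),
)

print(f"{EFFORT_W:.0f} W when fresh: {zone_fresh}")
print(f"{EFFORT_W:.0f} W when fatigued: {zone_fatigued}")
```

The power never changes. Only the thresholds move, and the same 190 watts crosses from below the first threshold to between the two.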
This is why athletes who test well but race poorly are not underperforming. They are performing exactly as their durability allows.
What Builds Durability
Durability is an aerobic quality. It is built through sustained sub-threshold volume over months and years. The adaptations that slow threshold decay are the same adaptations that define a strong aerobic engine: mitochondrial density, capillarisation, substrate transport capacity, and efficient heat dissipation. These are slow-building, training-age-dependent qualities that do not respond to shortcuts.
This is where the pyramidal training model earns its keep. The roughly 70 percent of training time spent in Zones 1 and 2 is not filler. It is not junk mileage. It is the stimulus that builds the oxidative infrastructure which keeps your thresholds from collapsing under sustained load. The roughly 25 percent spent in Zone 3, between the thresholds, develops the specific capacity to sustain work near those thresholds for longer. Together, they build an engine that holds.
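If you log time in zone, checking a week against that split is simple arithmetic. The sketch below assumes a hypothetical ten-hour week and assumes that whatever remains after Zones 1 to 3 is work above the second threshold. The text gives no target for that remainder, so none is applied.

```python
# Approximate shares taken from the text. The remainder is left without a target.
TARGET_SHARES = {"Zone 1-2": 0.70, "Zone 3": 0.25}


def zone_shares(hours_by_zone):
    """Convert hours per zone into each zone's fraction of total training time."""
    total_hours = sum(hours_by_zone.values())
    if total_hours <= 0:
        raise ValueError("total training time must be positive")
    return {zone: hours / total_hours for zone, hours in hours_by_zone.items()}


# Hypothetical ten-hour training week.
week_hours = {"Zone 1-2": 7.0, "Zone 3": 2.5, "Zone 4-5": 0.5}
shares = zone_shares(week_hours)

for zone, share in shares.items():
    target = TARGET_SHARES.get(zone)
    note = f"target about {target:.0%}" if target is not None else "no target stated"
    print(f"{zone}: {share:.0%} of the week ({note})")
```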
Athletes who chase intensity at the expense of this volume are training their fresh-state ceiling without training the ability to maintain it. The 20-minute test improves. The race does not. The ceiling looks higher, but the floor underneath it is just as fragile.
Bu has been direct about this. Fatigue, he has argued, is too often neglected in training because it is uncomfortable and difficult to prescribe. But if you never train the body's ability to perform under accumulated fatigue, you will never develop the adaptation required to sustain performance when it matters. Short segments in a fresh state do not replicate the physiological demands of a four-hour or eight-hour race.
This is not a new insight, but it is one the broader endurance community is only now formalising. Researchers have begun studying durability as a distinct, measurable, and trainable physiological quality. The data confirms what experienced coaches have long observed: the athletes who perform best in long races are not necessarily those with the highest fresh-state thresholds. They are the ones whose thresholds hold up under prolonged stress.
Why Intensity Cannot Fix This
There is no intensity session that efficiently builds durability. Threshold intervals raise your fresh-state threshold. VO2max intervals raise your maximal aerobic capacity. Neither teaches your body to resist the decay that sets in over hours of sustained output.
Durability requires time under sub-threshold load. It requires the body to manage substrate utilisation across hours, to regulate core temperature across changing conditions, and to maintain contractile efficiency as fatigue accumulates. These are adaptations to duration, not intensity. They develop in the long sessions, the repeated weeks of consistent aerobic work, and the patient accumulation of training age.
This is not an argument for mindless volume. The quality of the sub-threshold work matters. Execution matters. But the adaptation itself is fundamentally a product of sustained aerobic stress, and there is no way to compress that stimulus into short intervals on a Tuesday.
What This Means for Your Training
If your test numbers look good but your races do not reflect them, do not chase a higher threshold. Chase a more durable one. The question is not how high your ceiling sits when you are fresh. The question is how high it sits when you have been racing for three hours.
Building a power-duration or velocity-duration curve across multiple durations, and updating it regularly, shows both your fresh-state capacity and how that capacity holds over longer efforts. When the short-duration outputs are strong but the long-duration outputs lag behind, the curve is telling you that durability is the limiter. The prescription is not more intensity. It is more time spent building the aerobic foundation that keeps your physiology stable as the race gets long.
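One rough way to read such a curve is to express each longer effort as a fraction of a short reference effort and watch how that fraction changes over time. The sketch below is illustrative only. The best-effort powers are hypothetical, and the 0.80 cutoff is an arbitrary placeholder, not a physiological standard.

```python
def retention_curve(best_power_w, reference_min):
    """Express each effort at or beyond the reference duration as a fraction of it."""
    reference_w = best_power_w[reference_min]
    return {
        minutes: watts / reference_w
        for minutes, watts in sorted(best_power_w.items())
        if minutes >= reference_min
    }


def durability_is_limiter(curve, cutoff):
    """True when the longest effort retains less than `cutoff` of the reference power."""
    longest_min = max(curve)
    return curve[longest_min] < cutoff


# Hypothetical best efforts: duration in minutes -> average power in watts.
best_power_w = {20: 280.0, 60: 255.0, 180: 210.0}
ARBITRARY_CUTOFF = 0.80  # placeholder for illustration only

curve = retention_curve(best_power_w, reference_min=20)
limiter = durability_is_limiter(curve, ARBITRARY_CUTOFF)

for minutes, fraction in curve.items():
    print(f"{minutes:>4} min: {fraction:.0%} of 20-minute power")
print("Long-duration output lags:", limiter)
```

Power falls with duration for every athlete, so a single curve says little on its own. The more useful comparison is against your own earlier curves: if the 20-minute number climbs while the three-hour fraction stays flat or falls, durability is the likely limiter.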
This is a patient process. Durability improves with training age. It responds to consistency across months and years, not to any single training block. The athletes who race best in the final hour are the ones who spent the previous twelve months building the oxidative infrastructure to support it.
Your threshold is only useful if it lasts. The athletes who race well are not the ones who test the highest. They are the ones whose numbers hold when it matters.