Why Your VO2 Max Estimate Keeps Changing (And What It Actually Means)

19 April 2026 · Myles Bruggeling

Your Garmin says your VO2 max dropped 2 points this week. You didn’t get less fit overnight. The estimate changed because the model is fragile, and most athletes don’t know what they’re actually looking at.

VO2 max is the maximum rate at which your body can consume oxygen during intense exercise. It’s measured in millilitres of oxygen per kilogram of body weight per minute (ml/kg/min). A lab test measures it directly using a graded exercise protocol with gas exchange analysis. The number is objective and repeatable within about 3 to 5%.

The number on your wearable is not that. It’s an estimate derived from heart rate and pace data during submaximal exercise. The gap between these two things is where the confusion lives.

How Wearable VO2 Max Estimates Work

Garmin uses the Firstbeat Analytics engine, which it now owns. Apple Watch uses a proprietary algorithm. Both work on the same principle: estimating cardiorespiratory fitness from the relationship between heart rate and workload during submaximal exercise.

The logic is straightforward. Fitter people have a lower heart rate at a given running pace than less fit people. If you can run at 5:30/km with an average heart rate of 145, you’re fitter than someone who runs the same pace at 165. The algorithm maps this relationship against a database of lab tested subjects and produces an estimate.
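To make that logic concrete, here is a minimal sketch of a textbook submaximal extrapolation. It is not the Firstbeat or Apple algorithm, both of which are proprietary. It assumes the ACSM running equation for the oxygen cost of level running, and it assumes that the fraction of heart rate reserve you are using roughly equals the fraction of VO2 reserve. The resting heart rate of 50 and maximum heart rate of 185 are made-up values for illustration.

```python
def running_vo2(pace_min_per_km):
    """Oxygen cost of level running in ml/kg/min (ACSM running equation)."""
    speed_m_per_min = 1000.0 / pace_min_per_km
    return 0.2 * speed_m_per_min + 3.5


def estimate_vo2max(pace_min_per_km, avg_hr, resting_hr, max_hr):
    """Extrapolate one submaximal effort to maximum via heart rate reserve."""
    if not resting_hr < avg_hr <= max_hr:
        raise ValueError("expected resting_hr < avg_hr <= max_hr")
    fraction_of_reserve = (avg_hr - resting_hr) / (max_hr - resting_hr)
    # Assumption: %HR reserve ~= %VO2 reserve; 3.5 ml/kg/min is resting VO2.
    net_cost = running_vo2(pace_min_per_km) - 3.5
    return 3.5 + net_cost / fraction_of_reserve


# Same 5:30/km pace, two heart rates (resting 50 and max 185 are assumed).
for hr in (145, 165):
    print(hr, round(estimate_vo2max(5.5, hr, resting_hr=50, max_hr=185), 1))
```

With these assumed inputs the runner at 145 bpm comes out around 55 ml/kg/min and the runner at 165 bpm around 46. The pace is identical; only the heart rate differs. That is the whole mechanism, and it is also why anything that moves heart rate moves the estimate.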

The problem is that heart rate during submaximal exercise is influenced by far more than fitness.

What Makes the Estimate Move (That Isn’t Fitness)

Heat. A 10 degree Celsius increase in ambient temperature adds 10 to 15 beats per minute to your heart rate at the same pace. Your Garmin doesn’t adjust for temperature. It sees higher heart rate at the same pace and concludes you got less fit. You didn’t. It was just hot.

Dehydration. Even mild dehydration (1 to 2% body mass loss) elevates heart rate by 5 to 10 beats through reduced plasma volume. A run done on a morning where you didn’t hydrate well will produce a lower VO2 max estimate than the same run well hydrated.

Sleep deprivation. Poor sleep elevates resting and exercising heart rate. One bad night can add 5 to 8 beats to your running heart rate. The algorithm doesn’t know you slept poorly. It just sees higher heart rate at the same pace.

Caffeine. Caffeine can raise heart rate by 3 to 7 beats. Run with coffee and your estimate might differ from a fasted run.

Fatigue from prior training. Heart rate on an easy run the day after a hard interval session will be higher than normal, so the estimate drops. That’s not because you lost fitness. It’s because you’re still recovering from the session that built fitness.

Stress. Psychological stress elevates sympathetic nervous system activity. This raises baseline heart rate and exercise heart rate. A stressful week at work can cause your VO2 max estimate to drop without any change in actual fitness.

GPS accuracy. Pace is the other input to the calculation. If your GPS track was inaccurate (tree cover, buildings, wrist position), the pace data is wrong and the estimate is wrong. Running in an urban canyon can produce wildly different VO2 max estimates compared to the same run on an open road.
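The size of these effects is easier to see with numbers. The sketch below pushes the confounders above through a toy estimator (ACSM running cost scaled by heart rate reserve, which is an assumption and not what Garmin or Apple actually run). The heart rate offsets are approximate midpoints of the ranges quoted above, and the baseline runner (5:30/km at 145 bpm, resting 50, max 185) is hypothetical. Real engines presumably filter and average their inputs, so actual swings will differ.

```python
def estimate_vo2max(pace_min_per_km, avg_hr, resting_hr=50, max_hr=185):
    """Toy estimate: ACSM running cost scaled by fraction of heart rate reserve."""
    net_cost = 0.2 * (1000.0 / pace_min_per_km)  # ml/kg/min above rest
    return 3.5 + net_cost * (max_hr - resting_hr) / (avg_hr - resting_hr)


BASE_PACE = 5.5   # min/km, hypothetical baseline run
BASE_HR = 145     # bpm, hypothetical baseline run

# Approximate midpoints (bpm) of the ranges quoted in the text.
HR_OFFSETS = {
    "heat (+10 C)": 12,
    "mild dehydration": 7,
    "one bad night": 6,
    "caffeine": 5,
}


def shift_from_hr(offset_bpm):
    """Change in the estimate when heart rate is offset_bpm higher at the same pace."""
    return estimate_vo2max(BASE_PACE, BASE_HR + offset_bpm) - estimate_vo2max(BASE_PACE, BASE_HR)


def shift_from_pace_error(error_fraction):
    """Change in the estimate when GPS reports pace slower by error_fraction."""
    slow_pace = BASE_PACE * (1 + error_fraction)
    return estimate_vo2max(slow_pace, BASE_HR) - estimate_vo2max(BASE_PACE, BASE_HR)


for name, bpm in HR_OFFSETS.items():
    print(f"{name}: {shift_from_hr(bpm):+.1f} ml/kg/min")
print(f"GPS pace 3% slow: {shift_from_pace_error(0.03):+.1f} ml/kg/min")
```

In this toy model the heat offset alone moves the estimate by several ml/kg/min, and none of the inputs involve any change in fitness.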

The Meaningful Trend Is Months, Not Days

True VO2 max changes slowly. Very slowly. For a well trained athlete, meaningful improvements happen over months of consistent training. Typical gains are 3 to 5% over a 12 week block of structured aerobic development. Losses take about 2 to 3 weeks of detraining to begin and progress at roughly 5 to 7% per month of inactivity.

Daily or weekly fluctuations in your wearable’s estimate are almost entirely noise from the confounding factors listed above. A 2 point drop from Monday to Friday is not a fitness change. It’s measurement variability.
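A quick sanity check makes the same point with arithmetic. The sketch below takes the upper ends of the rates quoted above (5% gain per 12 week block, 7% loss per month) and scales them linearly to a shorter window. The linear scaling is a simplifying assumption, and it is generous, because real losses take a couple of weeks to begin at all. The function names are made up for illustration.

```python
MAX_MONTHLY_LOSS = 0.07  # upper end of the 5 to 7% per month detraining range
MAX_BLOCK_GAIN = 0.05    # upper end of the 3 to 5% gain per 12 week block


def plausible_change_range(baseline, days):
    """Rough bounds on real VO2 max change over `days` (rates scaled linearly)."""
    max_loss = baseline * MAX_MONTHLY_LOSS * days / 30
    max_gain = baseline * MAX_BLOCK_GAIN * days / 84
    return -max_loss, max_gain


def is_probably_noise(baseline, observed_change, days):
    """True when the observed change is larger than physiology could explain."""
    low, high = plausible_change_range(baseline, days)
    return not (low <= observed_change <= high)


low, high = plausible_change_range(45, days=4)
print(f"plausible real change over 4 days: {low:+.2f} to {high:+.2f} ml/kg/min")
print("2 point drop is noise:", is_probably_noise(45, -2, days=4))
```

For a baseline of 45, even the worst case detraining rate allows less than half a point of real change in four days. A 2 point drop in that window can’t be fitness.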

The useful signal from VO2 max estimates is the 30 to 90 day trend. If your estimate has been gradually climbing over three months of training, your cardiorespiratory fitness is genuinely improving. If it’s been gradually declining over three months despite consistent training, something systemic is wrong (overtraining, underfuelling, poor sleep quality over time).

Day to day movements should be ignored. Weekly averages are marginally useful. Monthly averages are where the actual signal lives.
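If your platform lets you export daily estimates, a trailing average is easy to compute yourself. This is a minimal sketch with invented data: a slow upward drift of 0.02 points per day plus a repeating wobble of up to 2 points. The 30 day window is one choice within the 30 to 90 day range suggested above.

```python
from statistics import mean


def trailing_mean(values, window):
    """Mean of the last `window` values at each position (shorter at the start)."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out


# Hypothetical daily estimates: slow real improvement plus day-to-day wobble.
wobble = [0, 2, -1, -2, 1, 0, -2, 2, 1, -1]
daily = [44 + 0.02 * day + wobble[day % len(wobble)] for day in range(90)]
monthly = trailing_mean(daily, 30)

biggest_jump = max(abs(b - a) for a, b in zip(daily, daily[1:]))
print("largest day-to-day jump:", round(biggest_jump, 2))
print("30 day mean at day 30:", round(monthly[29], 2))
print("30 day mean at day 90:", round(monthly[89], 2))
```

The daily series jumps by several points at a time, while the 30 day mean moves by a little over a point across the whole period. The small, steady movement is the real change in this made-up example.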

The Discrepancy Problem

Ask anyone who owns both a Garmin and an Apple Watch what their VO2 max is. You’ll get two different numbers, often 3 to 6 ml/kg/min apart.

This happens because the algorithms use different models, different calibration databases, and different data inputs. Garmin uses running dynamics (if available from a compatible chest strap such as the HRM-Pro), walking data, and a broader exercise type library. Apple uses primarily walking and running heart rate data.

Neither is necessarily more accurate than the other. Both are estimates with an error margin of roughly plus or minus 5 ml/kg/min on any individual reading. The true value sits somewhere in that range, and you won’t know exactly where without a lab test.

For context, a VO2 max estimate of 45 on your watch could mean your actual VO2 max is anywhere from 40 to 50. That range spans from “recreational fitness” to “competitive age group athlete.” The estimate is too imprecise to place you in a meaningful fitness category on any single reading.

Over time, with many readings, the average converges closer to your true value. But the individual readings are only directionally useful.
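How quickly does the average converge? Under a strong simplifying assumption, that each reading’s error is random and independent of the others, the margin on the average shrinks with the square root of the number of readings. The sketch below applies that rule to the plus or minus 5 ml/kg/min figure. Real readings share conditions and device biases, so treat this as a best case: averaging removes random scatter, not a consistent offset.

```python
import math


def margin_of_mean(single_reading_margin, n_readings):
    """Margin on the average of n independent readings with equal random error."""
    if n_readings < 1:
        raise ValueError("need at least one reading")
    return single_reading_margin / math.sqrt(n_readings)


for n in (1, 4, 12, 30):
    print(f"{n:>2} readings: +/- {margin_of_mean(5.0, n):.2f} ml/kg/min")
```

Four readings halve the margin, and a month of readings brings it to under 1 ml/kg/min in this idealised case. That is why the monthly average is worth looking at and the single reading isn’t.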

What VO2 Max Estimates Are Actually Good For

Despite their limitations, wearable VO2 max estimates serve two useful purposes.

First, long term trend tracking. If you’ve been training consistently for six months and your 30 day average VO2 max estimate has climbed from 42 to 46, you’ve improved your aerobic fitness. The absolute number might be off, but the direction and magnitude of change are meaningful.

Second, comparison with yourself over time. Your estimate’s biases (the specific confounders that affect your readings, the GPS quality in your usual routes, your typical hydration state) are relatively consistent. This means your personal trend, measured under similar conditions, is more reliable than comparing your number to someone else’s or to a population table.

What they’re not good for: comparing yourself to other athletes, making race pace predictions, or assessing whether you’re “fit enough” for a specific goal based on a single reading.

The Age Factor

VO2 max declines with age. The rate is roughly 5 to 10% per decade after age 30, though consistent training slows this significantly. A well trained 50 year old might have a VO2 max of 45 to 50 ml/kg/min, which would be “superior” for their age despite being average for a 25 year old athlete.

Most wearable platforms show VO2 max against age grouped percentiles, which is helpful. But the estimate itself is still subject to all the same confounders regardless of age. And the precision of the estimate may actually be worse for older athletes because the calibration databases are weighted toward younger subjects.

For athletes over 40, the practical advice is: watch the trend over 3 to 6 months. Ignore individual readings. Don’t panic if it drops 3 points during a heavy training block (it will recover). And if you want to know your actual VO2 max, book a lab test. The $150 to $200 cost buys you a number that’s repeatable to within about 3 to 5%, compared to the plus or minus 5 ml/kg/min you’re currently working with.

Where This Fits in the Bigger Picture

VO2 max is one input to understanding athletic performance. It matters, but it’s not the only thing that matters. Lactate threshold, running economy, durability (ability to maintain performance under fatigue), and body composition all contribute to race performance independently of VO2 max.

Two athletes with identical VO2 max values can have drastically different race times because their thresholds, economy, and fatigue resistance differ. Chasing a VO2 max number in isolation optimises for one piece of a complex system.

A useful performance platform would contextualise VO2 max estimates alongside threshold estimates, efficiency trends, and body composition data. It would say “your VO2 max is stable but your cardiac drift is improving, suggesting your efficiency is developing even though your ceiling hasn’t changed.” That’s actionable. A number going up or down in isolation isn’t.

Your wearable gives you a number. The context that makes it useful lives somewhere else. Usually in your own head, if you know what to look for.
