Garmin VO2max: How Far Off Is Your Number? (79-Pair Cohort, 2026)
Your Garmin says 52. Is it right? In our cohort of paired Powertests, 68% of athletes got a Garmin VO2max number higher than their measured value — and the size of the error depends on how fit you already are.
What Garmin Actually Measures
Modern Garmin watches do not measure VO2max in the lab-physiology sense. They estimate it from the relationship between heart rate and pace (or power on a cycling computer) during your everyday rides and runs. The algorithm (most commonly the Firstbeat HRV-and-load model) looks for sub-maximal data points and extrapolates to what your maximum oxygen uptake would be if you pushed all the way to exhaustion.
It is a remarkably good engineering trick for a free, always-on estimator. But "estimator" is the right word, and an estimator carries a bias. The question for any athlete reading a number on the wrist is: how big is the bias in my fitness range, and which direction does it pull?
How We Tested It
A Mader-model Powertest measures both VO2max and VLamax directly from a short maximal protocol. We have run more than 13,300 valid Powertests on our cycling cohort, and a subset of those athletes also wear a Garmin (or Garmin-style) watch that produces a VO2max estimate within the comparison window.
For this analysis we pulled 79 paired observations: bike Powertests, each matched with an anonymised watch VO2max estimate dated close enough that no significant fitness change happened between the two. An eightieth pair would not change the headlines below by more than 0.1, but 79 is the real count, and we want to be transparent about it.
We are not claiming "15,000 watch comparisons". We are claiming 79 — and the bucket-stratified pattern those 79 reveal is consistent enough across fitness ranges that we are comfortable publishing it.
The Headline Numbers
| Metric | Value |
|---|---|
| Mean bias (watch − Powertest) | +3.19 VO2max points |
| Within ±3 points of measured | 41.8% |
| Within ±5 points of measured | 57.0% |
| Watch over-estimated | 68.4% |
| Watch under-estimated | 31.6% |
An unbiased estimator would scatter evenly around the measured value. Garmin's distribution is tilted: most athletes get a slightly-to-substantially higher number than the lab returns. That bias is the most reliable feature of the data.
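For readers who want to run the same summary on their own paired numbers, here is a minimal sketch of how the five headline metrics can be computed. The function name and the demo pairs are illustrative assumptions (the demo values are made up, not cohort data), and the sketch counts an exact tie as neither over nor under.

```python
from statistics import mean

def agreement_summary(pairs):
    """Summarise (watch, measured) VO2max pairs.

    pairs: iterable of (watch_vo2max, measured_vo2max) tuples.
    Shares are returned as fractions between 0 and 1.
    """
    diffs = [watch - measured for watch, measured in pairs]
    n = len(diffs)
    return {
        "n": n,
        "mean_bias": mean(diffs),                           # watch minus measured
        "within_3": sum(abs(d) <= 3 for d in diffs) / n,
        "within_5": sum(abs(d) <= 5 for d in diffs) / n,
        "over": sum(d > 0 for d in diffs) / n,              # watch reads high
        "under": sum(d < 0 for d in diffs) / n,             # watch reads low
    }

# Made-up pairs for illustration only; NOT cohort data.
demo_pairs = [(52.0, 41.0), (55.0, 51.0), (61.0, 60.0), (67.0, 70.0)]
summary = agreement_summary(demo_pairs)
print(summary)
```

With the four demo pairs the differences are +11, +4, +1 and −3, so the mean bias is 3.25 and three of the four readings are over-estimates.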
The Bias Curve — And Why It Flips
The single most useful table we can give you is the bias broken out by fitness bucket. The bias falls monotonically as fitness rises and, at the elite end, it inverts.
| Powertest VO2max bucket | n | Powertest avg | Watch avg | Bias (watch − PT) |
|---|---|---|---|---|
| <45 (untrained) | 10 | 39.7 | 51.5 | +11.79 |
| 45–55 (recreational) | 30 | 50.5 | 55.0 | +4.52 |
| 55–65 (trained) | 25 | 60.9 | 62.1 | +1.19 |
| 65+ (elite) | 14 | 70.6 | 68.3 | −2.23 |
Read this from top to bottom. An untrained rider with a true VO2max of 40 gets a watch reading near 52 — an inflation of nearly twelve points. A trained rider near 60 reads almost on the mark. An elite at 70+ actually gets under-reported by about two points.
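As a quick consistency check, the four bucket biases weighted by their sample sizes should reproduce the headline mean bias. The sketch below uses only the n and bias columns from the table above.

```python
# (label, n, bias) rows copied from the bucket table above.
buckets = [
    ("<45", 10, 11.79),
    ("45-55", 30, 4.52),
    ("55-65", 25, 1.19),
    ("65+", 14, -2.23),
]

total_n = sum(n for _, n, _ in buckets)
pooled_bias = sum(n * bias for _, n, bias in buckets) / total_n

print(total_n)                # 79
print(round(pooled_bias, 2))  # 3.19
```

The pooled value matches the +3.19 in the headline table, so the bucket breakdown and the overall figure describe the same 79 pairs.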
Why Regression Toward the Mean Explains This
The pattern is a textbook signature of an estimator that anchors on a population prior. Loosely: the algorithm has a built-in expectation that "most people are around 55–58 VO2max". When the rest of the data (your sub-maximal HR-to-power slope, your demographics, your training history on the watch) is ambiguous, the model pulls toward that prior. The further you sit from the prior, the more your wrist number drifts back toward it.
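For most enthusiast readers the upshot is simple: a wrist number in the low 60s is the most trustworthy slice of the curve, because that is where the bucket table shows the smallest bias. Below that, expect the watch to read high; well above it, expect it to read low.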
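One way to make the compression concrete is to fit a straight line through the four bucket averages. This is a rough sketch under a big assumption: it uses only the published bucket means (weighted by n), not the 79 individual pairs, and a straight line is our simplification, not Garmin's or Firstbeat's actual model. A slope below 1 is the signature of readings being pulled toward the centre.

```python
# (powertest_avg, watch_avg, n) per bucket, copied from the bucket table.
buckets = [(39.7, 51.5, 10), (50.5, 55.0, 30), (60.9, 62.1, 25), (70.6, 68.3, 14)]

def weighted_line_fit(rows):
    """Weighted least-squares fit of watch = intercept + slope * measured."""
    total = sum(w for _, _, w in rows)
    x_bar = sum(x * w for x, _, w in rows) / total
    y_bar = sum(y * w for _, y, w in rows) / total
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in rows)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = weighted_line_fit(buckets)

# Measured value at which the fitted line predicts watch == measured (zero bias).
crossover = intercept / (1 - slope)

print(round(slope, 2), round(crossover, 1))
```

On these four points the slope comes out at roughly 0.6 and the zero-bias crossover lands in the low-to-mid 60s, which lines up with the trained bucket being the closest to the mark. Treat both figures as indicative only, since four averaged points hide the individual scatter.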
How to Interpret Your Garmin Number
Here is a one-step correction that needs only simple arithmetic:
| Self-assessment | Garmin reads | Correction to estimate true VO2max |
|---|---|---|
| New to structured training, mostly easy rides | 48–55 | Subtract 10–12 points |
| Recreational, some intervals, 4–6 h/week | 53–58 | Subtract 4–5 points |
| Trained, structured plan, racing | 60–65 | Trust the number ±1–2 |
| Elite or near-elite, frequent racing | 65–72 | Add 2 points |
This is not a substitute for a measurement; it is a sanity check. The corrections are central-tendency estimates for our cohort, and individual variation around them is substantial (the standard deviation of the bias across all 79 pairs is 5.7 points). Treat them as "your true value is probably in this range", not "your true value is exactly this".
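If you would rather script the lookup, the table above can be written as a small function. This is a minimal sketch: the profile keys ("new", "recreational", "trained", "elite") are labels we made up for the four rows, and the output is a rough central range that ignores the 5.7-point individual spread.

```python
# Point adjustments (low, high) added to the Garmin reading, one entry per
# row of the correction table. Profile keys are hypothetical labels.
CORRECTIONS = {
    "new": (-12, -10),         # subtract 10-12 points
    "recreational": (-5, -4),  # subtract 4-5 points
    "trained": (-2, 2),        # trust the number, plus or minus 1-2
    "elite": (2, 2),           # add 2 points
}

def likely_true_range(garmin_vo2max, profile):
    """Return (low, high) for the likely true VO2max given a watch reading."""
    low_adj, high_adj = CORRECTIONS[profile]
    return garmin_vo2max + low_adj, garmin_vo2max + high_adj

print(likely_true_range(52, "new"))      # (40, 42)
print(likely_true_range(68, "elite"))    # (70, 70)
```

The first call mirrors the untrained example earlier in the article: a watch reading of 52 maps back to a likely true value of around 40 to 42.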
What About Apple Watch and Polar?
We do not yet have a paired comparison cohort for non-Garmin watches at the same level of methodological rigour. From the published literature and the broader VO2max-watch comparison work that does exist:
Apple Watch uses a similar HR-and-pace inference (built on Apple's own in-house model rather than Firstbeat's). Independent studies report mean biases in the same ballpark: roughly +2 to +5 points for running, and somewhat noisier for cycling without paired power data. The directional bias toward over-estimation appears to hold.
Polar is the most honest of the three about uncertainty: the watch returns a VO2max value with a stated confidence band, and Polar's documentation explicitly notes that the estimate degrades for fitness levels far from the population centre. Published comparisons report bias magnitudes similar to Apple's, with slightly tighter dispersion.
The honest summary: across all three brands, the direction of the bias (over-read for the untrained, accurate near the trained mean, under-read at the elite end) appears to be a property of HR-based VO2max estimation in general, not a specific brand failing.
The Measurement That Doesn't Need a Correction
If you want a number you can actually plan training from — not one that needs a bias-curve lookup — the Mader-model Powertest takes 25 minutes and returns both your VO2max and your VLamax (the second engine the watch cannot see at all). Most A Faster You athletes do their first one in their first week and re-test every 4–6 weeks after that.
For deeper reference on what "trained" actually looks like across distances and rider types, our larger cohort report compares 3,491 athletes to ACSM norms: VO2max & VLamax cohort report →
Start a free trial of A Faster You and your first Powertest sets the baseline. From there, the AI schedules every interval and every recovery day from real numbers — not estimates.
FAQ
Why does my Garmin VO2max change by 2 points day to day? The estimator updates after every aerobic activity, and the underlying HR-to-pace relationship is noisy: caffeine, sleep, ambient temperature, and hydration all shift it by a few percent. Trust the 28-day trend, not any single day. The day-to-day jitter is normal and is not a sign that your fitness genuinely fluctuated.
Is the watch number reliable as a trend, even if the absolute value is off? Mostly yes. The bias direction tends to be consistent for a given athlete over months, so a watch-reported increase from 50 to 53 generally reflects a real fitness gain, even if your true VO2max sits a few points lower throughout. Treat the trajectory as a signal; treat the absolute number as a rough anchor.
Does running vs. cycling matter for the watch number? Yes, by a lot. Most consumer watches return a higher running VO2max than cycling VO2max for the same athlete, because the algorithms are tuned on running data. If your watch shows two numbers, the running one is the better-validated of the two. The Powertest comparisons in this article are bike-paired; running-paired data is a separate cohort cut we will publish later.
My Garmin says 60 but my Powertest came back 50. Which should I believe? Believe the Powertest. The lab-grade measurement is the reference; the watch number is the inferred estimate. A 10-point gap is on the higher end of what we see, but it is in line with what the bias curve predicts for a recreational rider whose Garmin sits near the population mean.
Will my Garmin number get more accurate if I do more rides? Slightly, yes — the algorithm benefits from more sub-maximal data points and from heart-rate samples close to threshold. But it will not converge on your true value; the bias is structural, not data-volume-limited.
Does training to push my VO2max up actually move the watch number? Yes, in the same direction, but compressed. Adding three real points (Powertest +3) typically shows up as one to two watch points in our cohort, because the watch is biased toward the population centre. Plan from the Powertest delta, not the watch delta.
Are the watch VO2max numbers safe to share with a coach? For broad context, yes. For prescribing training zones, no — wrist-based VO2max is too noisy to anchor zone calculations. A coach who plans from a watch number is planning from an estimate of an estimate.
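The 28-day trend recommended in the first answer above can be approximated with a plain trailing average if you export your daily watch readings. This is a sketch of one reasonable smoothing choice, not Garmin's own trend calculation; the function name and the sample readings are made up for illustration.

```python
from datetime import date, timedelta

def trailing_mean(readings, as_of, window_days=28):
    """Mean of the readings dated within the window_days ending on as_of.

    readings: dict mapping datetime.date -> watch VO2max for that day.
    Returns None when no reading falls inside the window.
    """
    start = as_of - timedelta(days=window_days - 1)
    values = [v for d, v in readings.items() if start <= d <= as_of]
    return sum(values) / len(values) if values else None

# Made-up readings that alternate between 51 and 53 from day to day.
today = date(2026, 5, 12)
readings = {today - timedelta(days=i): 51 + 2 * (i % 2) for i in range(28)}

print(trailing_mean(readings, today))  # 52.0
```

The daily value jumps by two points in this example, yet the trailing mean sits at 52.0, which is the kind of jitter the trend is meant to absorb.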
Sources:
Mader, A. (2003). Glycolysis and oxidative phosphorylation as a function of cytosolic phosphorylation state and power output of the muscle cell. Eur J Appl Physiol, 88(4–5), 317–338.
Mader, A. & Heck, H. (1986). A theory of the metabolic origin of the "anaerobic threshold". Int J Sports Med, 7 Suppl 1, 45–65.
Bassett, D.R. & Howley, E.T. (2000). Limiting factors for maximum oxygen uptake. Med Sci Sports Exerc, 32(1), 70–84.
Firstbeat Technologies. Automated Fitness Level (VO2max) Estimation with Heart Rate and Speed Data (white paper, 2014, updated 2019).
Cohort: 13,300+ valid Powertests with 79 bike-paired Garmin-style VO2max observations, A Faster You internal database, snapshot 2026-05-12.