You have more health data on your wrist than any athlete in history. You also have less idea what to do with it than a coach with a stopwatch and a notebook in 1985.
That’s not a technology problem. It’s an interpretation problem. And after a week of writing about specific metrics, from HRV trends to sleep architecture to respiratory rate, the pattern is clear enough to name directly.
The gap isn’t data. The gap is context.
The Metrics Paradox
Twenty years ago, a serious recreational athlete might track three things: resting heart rate (manual, morning), training volume (hours and distance in a logbook), and body weight (bathroom scale). That’s it. Three data points. And coaches made remarkably good decisions with them.
Today, the same athlete might have access to: heart rate (continuous, 24/7), heart rate variability, sleep duration, sleep stages, sleep efficiency, skin temperature, respiratory rate, blood oxygen, training load (acute and chronic), training effect, VO2 max estimate, body battery, stress score, recovery score, readiness score, strain score, step count, active calories, resting calories, floors climbed, intensity minutes, and body composition (if they have a smart scale).
That’s 20+ metrics, collected automatically, every single day. And the athlete is often making worse decisions than the one with three metrics and a notebook.
Why? Because more data without better interpretation creates confusion, not clarity. Each metric tells a partial story. Together, they should tell a complete story. Instead, they tell 20 different partial stories that sometimes agree and sometimes contradict each other.
Your Whoop says recovery is green. Your Garmin says body battery is 35. Your Oura says readiness is 72. Your subjective feel is “tired but not destroyed.” Which signal do you trust? How do you reconcile four different assessments of the same underlying question: am I ready to train?
Most athletes default to one of two strategies. They pick their favourite metric and ignore the rest (anchoring on a single data point). Or they average across signals mentally and hope for the best (gut feel with a data veneer). Neither is optimal. Both are rational responses to an information environment that gives you everything except the one thing you need: a synthesis.
What Synthesis Actually Means
Synthesis is not averaging. It’s not weighting. It’s reasoning.
When a coach looks at an athlete’s data and says “I know your HRV is fine but I want you to go easy today,” they’re synthesising. They’re incorporating information the data doesn’t capture: the athlete’s training phase, the importance of tomorrow’s session, the emotional stress from work, the fact that they travelled yesterday, the knowledge that this athlete specifically tends to push through when they shouldn’t.
That’s context. And context transforms data from noise into signal.
A low HRV reading means different things depending on whether you’re:
- In week 3 of a heavy build block (expected, proceed with caution)
- In a deload week (unexpected, investigate)
- Coming back from illness (expected, be patient)
- Training for a race in 2 weeks (concerning, protect the taper)
- Dealing with work stress while training is light (not training-related, manage stress)
Same number. Five different interpretations. Five different appropriate responses. The data point is identical in all cases. The context changes everything.
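To make that concrete, here is a minimal sketch of the same idea as a lookup: one input (a low HRV reading) and a context that decides the output. The `Context` names and the `interpret_low_hrv` function are made up for illustration and don't belong to any platform's API. The assessments and responses are the ones from the list above.

```python
from enum import Enum


class Context(Enum):
    # Hypothetical labels for the five situations listed above.
    HEAVY_BUILD_WEEK_3 = "week 3 of a heavy build block"
    DELOAD_WEEK = "deload week"
    RETURN_FROM_ILLNESS = "coming back from illness"
    RACE_IN_TWO_WEEKS = "race in 2 weeks"
    WORK_STRESS_LIGHT_TRAINING = "work stress while training is light"


# (assessment, response) pairs taken from the list above.
LOW_HRV_READINGS = {
    Context.HEAVY_BUILD_WEEK_3: ("expected", "proceed with caution"),
    Context.DELOAD_WEEK: ("unexpected", "investigate"),
    Context.RETURN_FROM_ILLNESS: ("expected", "be patient"),
    Context.RACE_IN_TWO_WEEKS: ("concerning", "protect the taper"),
    Context.WORK_STRESS_LIGHT_TRAINING: ("not training-related", "manage stress"),
}


def interpret_low_hrv(context):
    """Return (assessment, response) for a low HRV reading in a given context."""
    return LOW_HRV_READINGS[context]


if __name__ == "__main__":
    for context in Context:
        assessment, response = interpret_low_hrv(context)
        print(f"Low HRV, {context.value}: {assessment}, {response}")
```

A real athlete rarely sits in exactly one of these boxes, which is the point: the number alone never picks the row.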
Why Wearables Can’t Do This (Yet)
Consumer wearable platforms face a genuine technical and philosophical challenge. They need to make recommendations that are useful for millions of users without knowing anything about each individual’s specific situation.
So they optimise for the general case. Recovery models use population-level training response curves. Sleep recommendations target generic adult guidelines. Training load thresholds use published research averages.
This works reasonably well for the average user doing average training with average goals. It breaks down for anyone whose situation is specific, which is everyone who cares enough to read a blog post about HRV trends.
The athlete training for a July Hyrox while managing bilateral knee rehab and balancing a full-time job and family commitments. The 52-year-old trying to maintain muscle mass while building running endurance for a half marathon. The couple training for a partner event where one has stress incontinence concerns and the other has aggravated patellae.
These are real people with real contexts. A generic recovery score doesn’t serve them well. The problem isn’t that the score is wrong. It’s that the score is never wrong enough to look obviously useless, so they trust it more than they should.
The Three Layers of Useful Data
After spending the past week examining individual metrics, a framework emerges for how health and fitness data should work.
Layer one is measurement. This is what wearables do well. Heart rate accuracy is within a few beats per minute. Sleep stage detection is roughly 80% accurate compared to polysomnography. Step counting is solid. GPS pace is reliable. The sensors work. The raw data is good enough.
Layer two is trends. This is what wearable apps do adequately. They show you 7-day, 30-day, and 90-day trend lines. They calculate rolling averages. They highlight deviations from baseline. Some platforms do this better than others (Garmin’s HRV status is genuinely well implemented, Oura’s temperature tracking is best in class), but the basic capability exists across all major platforms.
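For a sense of what layer two involves, here is a rough sketch of a rolling average and a deviation-from-baseline check using only the Python standard library. The function names, the 7-day and 30-day windows, and the use of standard deviations as the yardstick are assumptions for illustration. No wearable platform is known to compute it exactly this way.

```python
from statistics import mean, stdev


def rolling_mean(values, window):
    """Trailing mean for every position that has a full window behind it."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def baseline_deviation(values, recent_days=7, baseline_days=30):
    """How far the recent mean sits from the baseline mean,
    measured in baseline standard deviations.

    The baseline is the block of days immediately before the recent window.
    Window lengths are illustrative defaults, not a published standard.
    """
    if len(values) < recent_days + baseline_days:
        raise ValueError("not enough days of data for the chosen windows")
    recent = values[-recent_days:]
    baseline = values[-(recent_days + baseline_days) : -recent_days]
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation, deviation is undefined")
    return (mean(recent) - mean(baseline)) / spread


if __name__ == "__main__":
    # Made-up morning HRV values: a steady month, then a lower week.
    hrv = [60, 62] * 15 + [55] * 7
    print(rolling_mean(hrv, 7)[-1])
    print(baseline_deviation(hrv))
```

A strongly negative result says the last week sits well below your own normal. It says nothing about why, which is where layer three comes in.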
Layer three is interpretation. This is what almost nobody does. Interpretation means taking the trend from layer two, combining it with knowledge about the athlete’s training plan, phase, goals, age, injury history, and life context, and producing a specific recommendation that accounts for all of those factors.
Layer one is a solved problem. Layer two is a partially solved problem. Layer three is an unsolved problem that represents virtually all of the remaining value in wearable health data.
The Coach in Your Pocket (That Isn’t)
“Coach in your pocket” is a phrase that gets thrown around in wearable marketing. But a coach doesn’t just read your numbers. A coach knows your history. A coach remembers that you tweaked your ankle three weeks ago and it’s probably not fully healed even though it stopped hurting. A coach noticed that your performance drops every time you have a stressful week at work. A coach knows that you tend to underreport fatigue because you don’t want to appear weak.
A coach integrates quantitative data (heart rate, pace, load) with qualitative data (mood, motivation, life stress, injury history) and makes a judgment call that accounts for both.
Your wearable has no access to the qualitative layer. It doesn’t know you’re going through a divorce. It doesn’t know your kid has exams and the household stress is elevated. It doesn’t know you just started a new medication. It doesn’t know you’re motivated today but have been dreading training all week.
Some platforms have tried to bridge this with subjective input. Whoop asks you to log journal entries about alcohol, caffeine, screen time, and other behaviours. Oura asks about your day. These are steps in the right direction. But the entries still aren’t connected to the quantitative data in any meaningful analytical way. The journal entry doesn’t change the recovery score. It sits alongside it in the app like two strangers at a bus stop.
What Changes This
The technology to do proper synthesis exists. It’s not a hardware problem. It’s not even primarily a software problem. It’s a data integration problem combined with a domain knowledge problem.
Integration means connecting wearable biometric data with training plan data, lifestyle data, and historical context into a single analytical framework. The APIs exist. The data formats are known. The technical barriers are about business partnerships and data sharing agreements, not technology.
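As a rough sketch of what "a single analytical framework" could look like at its simplest, the snippet below joins three separate sources on the calendar day. Everything here is hypothetical: the `DailyRecord` type, the `integrate` function, and the shape of the input dictionaries are stand-ins, not any vendor's export format or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyRecord:
    """One day of an athlete's data, pulled together from separate sources."""
    day: date
    biometrics: dict                 # e.g. {"hrv_ms": 58, "sleep_hours": 6.5}
    planned_session: Optional[str]   # from the training plan, if any
    lifestyle_notes: list            # free-text notes such as "travel"


def integrate(biometrics_by_day, plan_by_day, notes_by_day):
    """Join three date-keyed sources into one chronological list of records.

    Days missing from a source get an empty or None value instead of being dropped.
    """
    all_days = sorted(set(biometrics_by_day) | set(plan_by_day) | set(notes_by_day))
    return [
        DailyRecord(
            day=d,
            biometrics=biometrics_by_day.get(d, {}),
            planned_session=plan_by_day.get(d),
            lifestyle_notes=notes_by_day.get(d, []),
        )
        for d in all_days
    ]


if __name__ == "__main__":
    records = integrate(
        {date(2024, 5, 1): {"hrv_ms": 58}},
        {date(2024, 5, 1): "intervals", date(2024, 5, 2): "rest"},
        {date(2024, 5, 2): ["travel"]},
    )
    for record in records:
        print(record)
```

The join itself is trivial. Getting each source to hand over its data is the hard part.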
Domain knowledge means encoding the expertise that coaches, sports scientists, and sports medicine professionals use daily into software. “An athlete over 45 with declining HRV trend during week 3 of a build phase who has a history of tendon issues should reduce eccentric loading and maintain aerobic volume.” That’s a rule. It can be codified. It requires expertise to write but not to execute.
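Here is a minimal sketch of that one rule written as code. The `AthleteState` fields and the string labels are assumptions chosen for illustration. The conditions and the recommendation come straight from the rule quoted above. It is a toy, not coaching advice and not a complete rules engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AthleteState:
    # Hypothetical fields; a real system would draw these from integrated data.
    age: int
    hrv_trend: str              # "declining", "stable", or "rising"
    phase: str                  # e.g. "build", "deload", "taper"
    phase_week: int             # week number within the current phase
    tendon_issue_history: bool


def tendon_protection_rule(state: AthleteState) -> Optional[str]:
    """Return the recommendation if every condition of the rule is met, else None."""
    if (
        state.age > 45
        and state.hrv_trend == "declining"
        and state.phase == "build"
        and state.phase_week == 3
        and state.tendon_issue_history
    ):
        return "Reduce eccentric loading and maintain aerobic volume."
    return None


if __name__ == "__main__":
    athlete = AthleteState(
        age=52,
        hrv_trend="declining",
        phase="build",
        phase_week=3,
        tendon_issue_history=True,
    )
    print(tendon_protection_rule(athlete))
```

Writing the conditions took a coach's judgement. Running them takes a few comparisons.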
The combination of integration and domain knowledge produces interpretation. And interpretation is what turns a dashboard full of numbers into a decision you can act on.
This Is the Future of the Category
Every major wearable company will eventually move toward synthesis. The measurement race is over. Heart rate accuracy is good enough. Sleep staging is good enough. The hardware improvements from here are incremental.
The next competitive advantage in this market is not a better sensor. It’s a better answer to the question: “Given everything my device knows about me, what should I actually do today?”
That answer requires context. Your training plan. Your goals. Your age. Your injury history. Your life situation. Your specific sport’s demands. Your personal recovery patterns observed over months and years.
The athlete who gets this kind of personalised, contextual, multi-signal analysis will make better training decisions than the athlete staring at five different app screens trying to reconcile five different numbers.
Right now, the only way to get that synthesis is to hire a coach or become extremely good at cross-referencing your own data across multiple platforms while applying sports science principles you’ve taught yourself.
That works for the dedicated few. It leaves everyone else looking at recovery scores that are technically accurate and practically incomplete.
The data is already on your wrist. The context is in your head. The synthesis that connects them is the product nobody has built yet.