Melatonin Detection Model
TL;DR
Direct wearable detection of melatonin ingestion is not validated and is not possible with current consumer devices. Any melatonin-specific inference from Apple Watch, Oura, Whoop, or similar devices is proxy-only and not individually reliable. The only defensible use cases are population-level signal-consistency checks and cross-validation against user self-report.
⚠️ human_signoff_required: true — All biometric inference for melatonin is proxy-only. Do not build automated detection logic that claims to identify melatonin use from wearable data alone.
Detection approach
What is being detected
Not melatonin itself (no biochemical sensor exists in consumer wearables). Instead, a melatonin detection model would infer melatonin use from downstream physiological effects observable by wearables.
Validated signals (from RCTs and lab studies)
| Signal | Evidence | Wearable validity |
|---|---|---|
| Sleep-onset latency (SOL) reduction (~7–17 min at 0.5–5 mg) | Strong (multiple RCTs/meta-analyses) | Proxy only — effect size within consumer device error margin |
| Elevated nocturnal HRV (lab ECG) | Moderate (lab studies) | Proxy only — consumer wearable HRV never validated for melatonin |
| Reduced resting heart rate | Moderate (RCTs) | Proxy only — RHR is non-specific |
| Sleep efficiency improvement | Moderate (some RCTs) | Proxy only: wearables measure sleep efficiency well, but attributing a change to melatonin is not validated |
Unvalidated signals
| Signal | Problem |
|---|---|
| Sleep staging shifts (REM↑, SWS↑/↓) | Apple Watch sleep staging κ=0.20 vs PSG — essentially no agreement |
| HRV frequency-domain changes (LF, HF) | Consumer wearables don’t reliably measure these |
| Core body temperature | Consumer wearables don’t measure CBT |
| Melatonin concentration | No consumer wearable measures this |
Inference model framework
Input signals
melatonin_detection_score = f(
    sol_delta,          # change in sleep-onset latency (user-tracked or wearable-derived)
    hrv_delta,          # change in nocturnal HRV (rMSSD)
    rhr_delta,          # change in resting heart rate
    timing_alignment,   # was melatonin taken 30 min to 3 h before bedtime?
    light_exposure,     # was dim light maintained after dosing?
    baseline_confound   # pre-existing SOL, HRV, RHR severity
)
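As a minimal sketch of how the delta and timing inputs above might be derived, assuming per-night summaries are already available: the `NightSummary` schema, the field names, and the choice of a simple mean over each window are all illustrative assumptions, not part of the model definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class NightSummary:
    """One night of wearable-derived or user-tracked values (hypothetical schema)."""
    sol_min: float    # sleep-onset latency, minutes
    rmssd_ms: float   # nocturnal HRV (rMSSD), milliseconds
    rhr_bpm: float    # resting heart rate, beats per minute


def signal_deltas(baseline: list[NightSummary],
                  current: list[NightSummary]) -> dict[str, float]:
    """Mean over current nights minus mean over baseline nights, per signal."""
    def avg(nights: list[NightSummary], field: str) -> float:
        return mean(getattr(n, field) for n in nights)

    return {
        "sol_delta": avg(current, "sol_min") - avg(baseline, "sol_min"),
        "hrv_delta": avg(current, "rmssd_ms") - avg(baseline, "rmssd_ms"),
        "rhr_delta": avg(current, "rhr_bpm") - avg(baseline, "rhr_bpm"),
    }


def timing_aligned(dose_time: datetime, bedtime: datetime) -> bool:
    """True if the self-reported dose fell 30 min to 3 h before bedtime."""
    lead = bedtime - dose_time
    return timedelta(minutes=30) <= lead <= timedelta(hours=3)
```

A negative `sol_delta` means sleep onset got faster. The dose time has to come from self-report, since nothing in the wearable data supplies it.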
Confidence boundaries
- High confidence: User self-reports melatonin use + shows SOL improvement of >15 min + took it 30 min to 3 h before bed + maintained dim light
- Low confidence: Any single biometric signal in isolation
- Not detectable: From wearable data alone without user disclosure
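The three tiers above can be sketched as a small classifier. This is an illustration under stated assumptions: the `Confidence` enum and the function name are hypothetical, and mapping "self-reported but not meeting every high-confidence criterion" to the low tier is an interpretation, since the list only defines low confidence for isolated biometric signals.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    LOW = "low"
    NOT_DETECTABLE = "not_detectable"


def confidence_tier(self_reported: bool,
                    sol_improvement_min: float,
                    timing_aligned: bool,
                    dim_light: bool) -> Confidence:
    """Map self-report plus supporting signals onto the tiers listed above."""
    # Without user disclosure, wearable data alone supports no inference.
    if not self_reported:
        return Confidence.NOT_DETECTABLE
    # High confidence requires every criterion, including SOL improvement > 15 min.
    if sol_improvement_min > 15 and timing_aligned and dim_light:
        return Confidence.HIGH
    # Assumption: anything short of the full set is treated as low confidence.
    return Confidence.LOW
```

Self-report is checked first so that no combination of biometric values can produce a positive result on its own, in line with the sign-off warning at the top of this note.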
Key confounds
| Confound | Effect on signals | Impact |
|---|---|---|
| Sleep restriction / recovery | Can produce SOL improvements independent of melatonin | High |
| Alcohol use | Produces opposite HRV/SOL profile — excludes rather than confirms | High |
| Caffeine timing | Can delay SOL independent of melatonin | Moderate |
| Exercise timing | Late exercise can shift circadian phase | Moderate |
| Other sleep aids | Benzodiazepines, antihistamines, L-theanine, magnesium — all affect biometrics | Moderate |
| Device model | Different wearables have different SOL/HRV accuracy | Moderate |
| Circadian phase delay | Mistimed dosing can delay circadian phase, which worsens SOL | High |
Detection is not the goal
The primary Vitals use case is not detecting melatonin use (users should self-report) but validating coaching efficacy and detecting timing errors:
- A user on melatonin who shows no SOL improvement → possible timing error, insufficient dose, or wrong etiology
- A user who reports morning melatonin use → flag for timing correction (morning use can cause phase delay)
- A user showing large SOL improvement with correct timing → consistent with melatonin benefit; no automated detection required
What this model cannot do
- Cannot measure melatonin directly: no consumer wearable has a blood, saliva, or breath sensor for it
- Cannot reliably distinguish melatonin effect from placebo at the individual level — the effect is modest and within noise
- Cannot detect compliance — HRV/RHR changes are too non-specific
- Cannot support product quality claims — contamination and dose variance are not visible in biometrics
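To make the "within noise" limitation concrete, here is a rough sketch of how many nights per condition an individual would need before a mean SOL difference becomes distinguishable. It uses the standard two-sample normal approximation and assumes independent nights with equal variance. The night-to-night standard deviation passed in is a hypothetical input, not a measured device figure.

```python
from math import ceil
from statistics import NormalDist


def nights_per_condition(effect_min: float,
                         night_sd_min: float,
                         alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Approximate nights needed in each condition (with and without melatonin).

    n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect)^2, rounded up.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_power = z.inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * night_sd_min / effect_min) ** 2
    return ceil(n)
```

When the night-to-night spread equals the effect size, this gives about 16 nights per condition at the defaults, and the requirement grows with the square of the noise-to-effect ratio. Real nights are not independent and confounds shift the baseline, so the true requirement is likely higher.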
Coaching integration
| Scenario | Appropriate action |
|---|---|
| User self-reports melatonin use; biometrics improve | Accept self-report; log for longitudinal tracking |
| User denies melatonin use; biometrics suggest possible benefit | Do not infer; ask about sleep habits, timing, environment |
| User reports morning melatonin use | Flag: morning use risks phase delay; suggest evening timing (30 min to 3 h before bed) |
| User shows no SOL benefit despite melatonin use | Investigate timing, dose, light environment, or wrong insomnia etiology |
| User on fluvoxamine (CYP1A2 inhibitor) using melatonin | Flag: severe interaction; 17-fold AUC increase; do not recommend melatonin |
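The table above could be encoded as a small rule function. This is a sketch only: `UserState`, its fields, and the action strings are hypothetical names chosen for illustration, and every input other than `biometrics_improved` is assumed to come from user self-report.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    reports_melatonin: bool        # self-reported use
    biometrics_improved: bool      # e.g. SOL improvement versus baseline
    morning_dosing: bool = False   # self-reported dose timing
    on_fluvoxamine: bool = False   # self-reported medication


def coaching_actions(user: UserState) -> list[str]:
    """Return coaching actions for one user, following the scenario table."""
    actions: list[str] = []
    if not user.reports_melatonin:
        # Never infer melatonin use from biometrics; ask open questions instead.
        if user.biometrics_improved:
            actions.append("ask_about_sleep_habits_timing_environment")
        return actions
    if user.on_fluvoxamine:
        actions.append("flag_fluvoxamine_interaction")
    if user.morning_dosing:
        actions.append("flag_morning_timing")
    if user.biometrics_improved:
        actions.append("accept_self_report_and_log")
    else:
        actions.append("investigate_timing_dose_light_etiology")
    return actions
```

The safety flag is evaluated before the efficacy rows so that a user on fluvoxamine is flagged even when their biometrics look good.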
Product sourcing signal
Note: Melatonin product quality problems (71% of products deviate from the labeled dose; 26% contain undisclosed serotonin) are not detectable from wearable biometrics. This is a sourcing guardrail, not a detection-model signal. Wearables cannot distinguish a sub-potent product from a correctly dosed one.
→ See Melatonin hub for product sourcing recommendations.
Related notes
- Melatonin — hub note; dosing, evidence, safety
- Melatonin Sleep Biometrics — which sleep metrics melatonin changes and how reliably
- Sleep architecture — melatonin does not meaningfully alter sleep stages at physiological doses; sleep staging via consumer wearables is unreliable
- HRV — Apple Watch Limits — Apple Watch HRV accuracy caveats relevant to any melatonin HRV inference
- Circadian Biology — SCN signaling; timing is the critical variable for melatonin efficacy
- Cannabis detection model — contrast: cannabis detection is more feasible via HRV delta + REM + motion (AUC ~0.75–0.85); melatonin detection is harder due to smaller effect sizes and absence of a distinctive biometric signature
- Alcohol detection model — contrast: alcohol has a more distinctive biometric profile (HRV suppression + elevated RHR + sleep efficiency drop)