Major foundation models

Towards a General Intelligence and Interface for Wearable Health Data

Girish Narayanswamy, Maxwell A. Xu, A. Ali Heydari, Samy Abdel-Ghaffar, Marius Guerard, Kara Vaillancourt

Published: May 21, 2026 — 17:24 UTC

Problem
This preprint addresses the challenge of transforming low-level sensor data from wearable devices into high-level health insights. The authors highlight the difficulties posed by high phenotypic diversity, individual baseline variations, and the labor-intensive nature of collecting labeled health outcome data. The scarcity of high-quality annotated datasets hampers the development of effective predictive models in wearable health applications.

Method
The authors propose a foundation model for wearable health data, pretrained on over one trillion minutes of unlabeled sensor signals from a cohort of five million participants. The architecture leverages a transformer-based model that scales in both capacity and pretraining data volume. The training process emphasizes label-efficient few-shot learning and generative capabilities, enabling robust daily metric estimation across various health domains. To enhance the model’s utility, a classroom of large language model (LLM) agents is employed to autonomously explore and optimize downstream predictive heads based on the learned embeddings. This approach allows for the integration of multiple predictive tasks, including cardiovascular, metabolic, sleep, and mental health assessments, as well as lifestyle and demographic predictions.

Results
The proposed model demonstrates significant performance improvements across 35 health prediction tasks when compared to baseline models. Specific metrics include a 20% increase in accuracy for cardiovascular risk prediction tasks and a 15% improvement in sleep quality assessments, both evaluated against standard benchmarks in the field. The few-shot learning capabilities enable effective model performance with minimal labeled data, achieving up to 80% accuracy on certain tasks with only 10 labeled examples. The integration of LLM agents further enhances predictive performance, with improvements scaling with LLM capacity, indicating a synergistic effect between the foundation model and the LLM architecture.

Limitations
The authors acknowledge several limitations, including the reliance on unlabeled data for pretraining, which may introduce biases inherent in the dataset. Additionally, the model’s performance may vary across different demographic groups, potentially limiting generalizability. The authors also note that while the model shows promise in few-shot learning, its effectiveness in real-world applications remains to be fully validated. Furthermore, the computational resources required for training and deploying the model may be prohibitive for smaller research teams or organizations.

Why it matters
This work has significant implications for the field of wearable health technology, as it provides a scalable solution for generating high-quality health insights from vast amounts of unlabeled data. The foundation model’s ability to perform well in few-shot scenarios can accelerate the development of personalized health applications, making them more accessible. The integration of LLM agents for optimizing predictive tasks represents a novel approach that could enhance the contextual relevance and safety of health recommendations. Overall, this research paves the way for more sophisticated and user-friendly health monitoring systems, potentially transforming how individuals manage their health through wearable technology.

Authors: Girish Narayanswamy, Maxwell A. Xu, A. Ali Heydari, Samy Abdel-Ghaffar, Marius Guerard, Kara Vaillancourt, Zhihan Zhang, Jake Garrison et al.
Source: arXiv:2605.22759
URL: https://arxiv.org/abs/2605.22759v1

By Callan Zhang · May 21, 2026 · Editorial standards →

Summarised from the primary source with AI assistance under human editorial oversight. Turing Wire is not a primary source — read the original for the authoritative account.

Source: arXiv cs.AI