A robust PPG foundation model using multimodal physiological supervision
Eloy Geenjaar, Vince Calhoun, Scott Daly, Gouthaman KV, Lie Lu, Trisha Mittal, Daniel P. Darcy
Why It Matters
What makes this one worth your time
This approach could lead to more reliable PPG-based applications in consumer devices and clinical settings by improving model performance on noisy data.
A novel PPG model uses multimodal supervision to enhance robustness and generalization.
Summary
The paper proposes a PPG foundation model that leverages multimodal physiological signals, such as electrocardiogram and respiratory data, to improve the robustness and generalization of PPG models without requiring high-quality or field-like pretraining data.
Key contributions
- Introduction of a PPG foundation model using multimodal supervision.
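- Performance improvements on 14 of 15 diverse downstream tasks, including field-like daily activity and heart rate prediction, while pretraining on 3x fewer subjects than existing state-of-the-art approaches.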
Notable insights
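- Accompanying electrocardiogram and respiratory signals in ICU datasets are used to select contrastive samples during pretraining, which lets the model retain and learn from noisy PPG segments instead of requiring curated data.
- Multimodal supervision integrates complementary physiological information, which improves robustness and generalization to consumer-grade data.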
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2606.07365v1
Photoplethysmography (PPG), a non-invasive measure of changes in blood volume, is widely used in both wearable devices and clinical settings. Recent PPG foundation models either use open-source ICU datasets with pretraining paradigms that require curated data and thus complicate generalization to field-like data, or use closed-source field-like PPG data. In contrast, we propose a PPG foundation model that does not require high-quality or field-like pretraining data, and instead leverages accompanying electrocardiogram and respiratory signals in ICU datasets to select contrastive samples during pretraining. Our approach allows the model to retain and learn from noisy PPG segments, improving robustness at inference. Our model, pretrained on 3x fewer subjects than existing state-of-the-art approaches, achieves performance improvements on 14 out of 15 diverse downstream tasks, including field-like daily activity and heart rate prediction. Our results demonstrate that multimodal supervision can integrate complementary physiological information to improve the robustness of PPG foundation models and enhance their generalization to consumer-grade data.
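The abstract does not spell out how the electrocardiogram and respiratory signals pick contrastive samples, so the following is only a minimal sketch of one way such selection could work, not the authors' method. It assumes each PPG segment comes with a heart rate (from ECG) and a respiratory rate (from the respiratory signal), and treats two segments as a positive pair when both rates are close. The function names, the tolerances `hr_tol` and `rr_tol`, and the multi-positive InfoNCE loss are all hypothetical choices made for illustration.

```python
import numpy as np


def select_positives(hr, rr, hr_tol=5.0, rr_tol=2.0):
    """Hypothetical selection rule: segment j is a positive for segment i
    when their ECG-derived heart rates and respiratory rates are both close.
    Returns a boolean (N, N) mask with a False diagonal."""
    hr = np.asarray(hr, dtype=float)
    rr = np.asarray(rr, dtype=float)
    close_hr = np.abs(hr[:, None] - hr[None, :]) <= hr_tol
    close_rr = np.abs(rr[:, None] - rr[None, :]) <= rr_tol
    mask = close_hr & close_rr
    np.fill_diagonal(mask, False)  # a segment is never its own positive
    return mask


def multi_positive_info_nce(z, pos_mask, temperature=0.1):
    """InfoNCE-style loss over PPG embeddings z of shape (N, D), averaged
    over the positives chosen by pos_mask. Assumed loss, for illustration."""
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-similarity
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    has_pos = pos_mask.any(axis=1)
    if not has_pos.any():
        return 0.0  # no anchor has a positive in this batch
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos_mask.sum(axis=1)[has_pos]
    return float(per_anchor.mean())


# Toy batch: segments 0 and 1 share a resting state, 2 and 3 an active one.
hr = [60.0, 62.0, 100.0, 101.0]   # beats per minute (assumed ECG-derived)
rr = [12.0, 13.0, 20.0, 21.0]     # breaths per minute (assumed)
mask = select_positives(hr, rr)

z_aligned = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
z_shuffled = z_aligned[[0, 2, 1, 3]]
loss_aligned = multi_positive_info_nce(z_aligned, mask)
loss_shuffled = multi_positive_info_nce(z_shuffled, mask)
```

In this sketch the PPG waveform quality never enters the selection rule, so a noisy PPG segment still receives positives as long as its companion signals are usable. That is the property the abstract attributes to multimodal supervision, although the actual criterion used in the paper may differ.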