# Learning Unified Representations from Heterogeneous Data for Robust Heart Rate Modeling

**Authors:** Zhengdong Huang, Zicheng Xie, Wentao Tian, Jingyu Liu, Lunhong Dong, Peng Yang

arXiv: 2508.21785 · 2026-02-25

## TL;DR

This paper introduces a heart rate prediction framework that learns unified representations agnostic to both source (device) and user heterogeneity, improving robustness and accuracy across diverse devices and individuals.

## Contribution

The paper proposes a method that combines random feature dropout, a history-aware attention module, and a contrastive learning objective to model heterogeneous data, and it introduces a new benchmark dataset, PARROTAO, for evaluation.
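To make the random feature dropout idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the per-feature Bernoulli masking, the zero fill value, the rule of always keeping at least one feature, and the feature names (`speed`, `altitude`, `cadence`) are all assumptions made for illustration.

```python
import random

def random_feature_dropout(sample, drop_prob=0.3, rng=None, fill_value=0.0):
    """Randomly hide whole features to mimic devices with smaller feature sets.

    sample: dict mapping feature name -> list of floats (one value per time step).
    Returns (masked_sample, mask), where mask[name] is 1 if the feature was kept.
    """
    rng = rng or random.Random()
    names = sorted(sample)
    # Assumption: each feature is dropped independently with probability drop_prob.
    mask = {name: 0 if rng.random() < drop_prob else 1 for name in names}
    # Assumption: keep at least one feature so the input is never empty.
    if not any(mask.values()):
        mask[rng.choice(names)] = 1
    masked = {
        name: list(values) if mask[name] else [fill_value] * len(values)
        for name, values in sample.items()
    }
    return masked, mask

# Hypothetical workout record with three sensor channels.
workout = {
    "speed": [2.1, 2.4, 2.6],
    "altitude": [30.0, 31.5, 33.0],
    "cadence": [80.0, 82.0, 85.0],
}
masked, mask = random_feature_dropout(workout, drop_prob=0.5, rng=random.Random(0))
```

Returning the mask alongside the masked values lets a downstream model distinguish "feature missing" from "feature equals zero", which is one common design choice when training on variable feature sets.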

## Key findings

- Outperforms existing baselines, with 17.5% and 10.4% lower test MSE on the PARROTAO and FitRec datasets, respectively.
- The learned representations are strongly discriminative, and two downstream application tasks confirm their practical value.
- The model is robust to both device (source) and user heterogeneity.
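The percentage figures above are most naturally read as relative reductions in test MSE against the best baseline, although that reading is an assumption here. The sketch below shows the arithmetic; the two MSE values are placeholders chosen only to illustrate the calculation and are not taken from the paper.

```python
def relative_mse_reduction(baseline_mse, model_mse):
    """Fraction by which model_mse is lower than baseline_mse."""
    if baseline_mse <= 0:
        raise ValueError("baseline_mse must be positive")
    return (baseline_mse - model_mse) / baseline_mse

# Placeholder values chosen only to illustrate the arithmetic; NOT from the paper.
example_baseline_mse = 200.0
example_model_mse = 165.0
reduction = relative_mse_reduction(example_baseline_mse, example_model_mse)
print(f"{reduction:.1%} lower test MSE")
```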

## Abstract

Heart rate prediction is vital for personalized health monitoring and fitness, but it frequently faces a critical challenge in real-world deployment: data heterogeneity. We classify it along two key dimensions: source heterogeneity, arising from fragmented device markets with varying feature sets, and user heterogeneity, reflecting distinct physiological patterns across individuals and activities. Existing methods either discard device-specific information or fail to model user-specific differences, limiting their real-world performance. To address this, we propose a framework that learns latent representations agnostic to both types of heterogeneity, enabling downstream predictors to work consistently under heterogeneous data patterns. Specifically, we introduce a random feature dropout strategy to handle source heterogeneity, making the model robust to various feature sets. To manage user heterogeneity, we employ a history-aware attention module to capture long-term physiological traits and use a contrastive learning objective to build a discriminative representation space. To reflect the heterogeneous nature of real-world data, we created a new benchmark dataset, PARROTAO. Evaluations on both PARROTAO and the public FitRec dataset show that our model significantly outperforms existing baselines, by 17.5% and 10.4% in terms of test MSE, respectively. Furthermore, analysis of the learned representations demonstrates their strong discriminative power, and two downstream application tasks confirm the practical value of our model.
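The abstract does not specify the form of the contrastive objective. As an illustration only, the sketch below uses an InfoNCE-style loss, a common choice for building discriminative representation spaces; it should not be read as the paper's actual loss. The cosine similarity, the temperature value, and the toy embeddings are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: pull the positive toward the anchor, push negatives away."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy 2-D embeddings (hypothetical): the positive could be a segment from the
# same user as the anchor, the negatives segments from other users.
anchor = [1.0, 0.0]
same_user = [0.9, 0.1]
other_users = [[0.0, 1.0], [-1.0, 0.0]]

well_separated = info_nce(anchor, same_user, other_users)
poorly_separated = info_nce(anchor, [0.0, 1.0], [same_user, [-1.0, 0.0]])
```

In this toy setup the loss is lower when the designated positive lies close to the anchor than when a distant embedding is treated as the positive, which is the behavior a contrastive objective relies on to separate users in representation space.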

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21785/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21785/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/2508.21785/full.md

---
Source: https://tomesphere.com/paper/2508.21785