RhythmNet: End-to-end Heart Rate Estimation from Face via Spatial-temporal Representation
Xuesong Niu, Shiguang Shan, Hu Han, Xilin Chen

TL;DR
RhythmNet is an end-to-end deep learning model that estimates heart rate from face videos using spatial-temporal representations, demonstrating improved accuracy in unconstrained scenarios and leveraging a new large-scale database.
Contribution
The paper introduces RhythmNet, a novel deep learning framework for remote HR estimation, and provides a large-scale, diverse VIPL-HR database for training and evaluation.
Findings
Outperforms state-of-the-art methods on multiple datasets.
Effective in scenarios with head movement and poor illumination.
Utilizes a new large-scale, diverse HR database.
Abstract
Heart rate (HR) is an important physiological signal that reflects the physical and emotional status of a person. Traditional HR measurements usually rely on contact monitors, which may cause inconvenience and discomfort. Recently, some methods have been proposed for remote HR estimation from face videos; however, most of them focus on well-controlled scenarios, and their generalization ability to less-constrained scenarios (e.g., with head movement and bad illumination) is not known. At the same time, the lack of large-scale HR databases has limited the use of deep models for remote HR estimation. In this paper, we propose an end-to-end RhythmNet for remote HR estimation from the face. In RhythmNet, we use a spatial-temporal representation encoding the HR signals from multiple ROI volumes as its input. Then the spatial-temporal representations are fed into a convolutional network for HR…
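To make the idea of a spatial-temporal representation built from multiple ROI volumes more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `spatial_temporal_map`, the 5x5 grid of ROI blocks, the per-block mean pooling, and the min-max normalization are all hypothetical choices, and the input is assumed to be an already aligned face crop of shape (T, H, W, C).

```python
import numpy as np

def spatial_temporal_map(frames, grid=5):
    """Build a (grid*grid, T, C) map from face crops of shape (T, H, W, C).

    Hypothetical sketch: each row is one ROI block, each column is one frame,
    and each entry is the mean pixel value of that block in that frame.
    """
    frames = np.asarray(frames, dtype=np.float64)
    t, h, w, c = frames.shape
    # Crop so height and width divide evenly into the grid (assumption).
    bh, bw = h // grid, w // grid
    frames = frames[:, :bh * grid, :bw * grid, :]
    # Split each frame into grid x grid blocks and average inside each block.
    blocks = frames.reshape(t, grid, bh, grid, bw, c)
    means = blocks.mean(axis=(2, 4))  # shape (T, grid, grid, C)
    # Rows index ROI blocks, columns index time.
    st_map = means.reshape(t, grid * grid, c).transpose(1, 0, 2)
    # Min-max normalize each ROI's temporal signal per channel (assumption).
    lo = st_map.min(axis=1, keepdims=True)
    hi = st_map.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (st_map - lo) / span

# Usage: 30 random frames of a 50x50 RGB face crop.
rng = np.random.default_rng(0)
video = rng.random((30, 50, 50, 3))
st_map = spatial_temporal_map(video)
```

The resulting array can be treated as an image-like input to a convolutional network, which is the role the abstract describes for the spatial-temporal representation.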
