Periodic-MAE: Periodic Video Masked Autoencoder for rPPG Estimation
Jiho Choi, Sang Jun Lee

TL;DR
Periodic-MAE is a self-supervised framework that uses a video masked autoencoder to learn periodic signals from unlabeled facial videos, improving remote photoplethysmography (rPPG) estimation and cross-dataset generalization.
Contribution
The paper proposes a periodic video masked autoencoder that learns a high-dimensional spatio-temporal representation of the facial region and incorporates physiological bandlimit constraints to improve rPPG estimation.
Findings
Significant performance improvements on multiple datasets.
Enhanced cross-dataset generalization.
Effective capture of quasi-periodic physiological signals.
Abstract
In this paper, we propose a method that learns a general representation of periodic signals from unlabeled facial videos by capturing subtle changes in skin tone over time. The proposed framework employs the video masked autoencoder to learn a high-dimensional spatio-temporal representation of the facial region through self-supervised learning. Capturing quasi-periodic signals in the video is crucial for remote photoplethysmography (rPPG) estimation. To account for signal periodicity, we apply frame masking in terms of video sampling, which allows the model to capture resampled quasi-periodic signals during the pre-training stage. Moreover, the framework incorporates physiological bandlimit constraints, leveraging the property that physiological signals are sparse within their frequency bandwidth to provide pulse cues to the model. The pre-trained encoder is then transferred to the rPPG…
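The two ideas named in the abstract, frame masking tied to video sampling and a bandlimit constraint on the frequency content, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the every-k-th-frame masking rule, and the 0.66 to 3.0 Hz pulse band (roughly 40 to 180 beats per minute) are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def periodic_frame_mask(num_frames: int, stride: int, offset: int = 0) -> np.ndarray:
    """Return a boolean mask where True marks a masked (hidden) frame.

    Hypothetical masking rule: keep every `stride`-th frame starting at
    `offset` and mask the rest, so the visible frames form a uniformly
    resampled version of the clip.
    """
    mask = np.ones(num_frames, dtype=bool)
    mask[offset::stride] = False  # visible frames
    return mask


def out_of_band_power_ratio(signal, fps: float,
                            low_hz: float = 0.66, high_hz: float = 3.0) -> float:
    """Fraction of spectral power outside [low_hz, high_hz].

    The band limits are assumed values for a plausible heart-rate range,
    not taken from the paper. A value near 0 means the signal is
    concentrated inside the band; a value near 1 means it lies outside.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    total = power.sum()
    if total == 0.0:
        return 0.0
    return float(power[~in_band].sum() / total)


if __name__ == "__main__":
    fps = 30.0
    t = np.arange(300) / fps                 # 10 seconds of video
    pulse = np.sin(2 * np.pi * 1.2 * t)      # 1.2 Hz, i.e. 72 bpm
    flicker = np.sin(2 * np.pi * 6.0 * t)    # 6 Hz, outside the assumed band
    print(periodic_frame_mask(8, stride=2))
    print(out_of_band_power_ratio(pulse, fps))
    print(out_of_band_power_ratio(flicker, fps))
```

In a training setup, a ratio like this could serve as a penalty on a predicted waveform so that its energy stays inside the physiological band, while the mask selects which frames the encoder sees.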
Taxonomy
Topics: Non-Invasive Vital Sign Monitoring · Optical Imaging and Spectroscopy Techniques · Emotion and Mood Recognition
