Periodic-MAE: Periodic Video Masked Autoencoder for rPPG Estimation

Jiho Choi; Sang Jun Lee

arXiv:2506.21855·cs.CV·June 30, 2025

Open Access

TL;DR

This paper introduces Periodic-MAE, a self-supervised framework that uses a video masked autoencoder to capture periodic signals in facial videos for remote photoplethysmography (rPPG) estimation, with a focus on generalization across datasets.

Contribution

The paper proposes a periodic video masked autoencoder that learns a high-dimensional spatio-temporal representation of the facial region and incorporates physiological bandlimit constraints to improve rPPG estimation.

Findings

01. Significant performance improvements on multiple datasets.

02. Enhanced cross-dataset generalization.

03. Effective capture of quasi-periodic physiological signals.

Abstract

In this paper, we propose a method that learns a general representation of periodic signals from unlabeled facial videos by capturing subtle changes in skin tone over time. The proposed framework employs the video masked autoencoder to learn a high-dimensional spatio-temporal representation of the facial region through self-supervised learning. Capturing quasi-periodic signals in the video is crucial for remote photoplethysmography (rPPG) estimation. To account for signal periodicity, we apply frame masking in terms of video sampling, which allows the model to capture resampled quasi-periodic signals during the pre-training stage. Moreover, the framework incorporates physiological bandlimit constraints, leveraging the property that physiological signals are sparse within their frequency bandwidth to provide pulse cues to the model. The pre-trained encoder is then transferred to the rPPG…
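
The two ideas named in the abstract, masking frames by sampling and constraining the signal to a physiological frequency band, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed-stride masking rule, the power-ratio form of the penalty, and the 0.66 to 3.0 Hz band (roughly 40 to 180 beats per minute) are all assumptions chosen for illustration.

```python
import numpy as np


def periodic_frame_mask(num_frames: int, stride: int, offset: int = 0) -> np.ndarray:
    """Return a boolean mask where True marks a visible (kept) frame.

    Hypothetical helper: keeps every `stride`-th frame, which amounts to
    temporally subsampling the clip; all other frames are treated as masked.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    mask = np.zeros(num_frames, dtype=bool)
    mask[offset % stride::stride] = True
    return mask


def band_power_ratio(signal, fps: float, low_hz: float = 0.66, high_hz: float = 3.0) -> float:
    """Fraction of spectral power inside [low_hz, high_hz].

    The band edges are an assumed heart-rate range, not values from the paper.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC component before measuring power
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    total = power.sum()
    if total == 0.0:
        return 0.0
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return float(power[in_band].sum() / total)


def bandlimit_penalty(signal, fps: float, low_hz: float = 0.66, high_hz: float = 3.0) -> float:
    """Assumed loss term: share of power that falls outside the pulse band."""
    return 1.0 - band_power_ratio(signal, fps, low_hz, high_hz)


if __name__ == "__main__":
    fps = 30.0
    t = np.arange(300) / fps                    # 10 seconds of samples
    pulse = np.sin(2 * np.pi * 1.2 * t)         # 1.2 Hz, i.e. 72 bpm
    flicker = np.sin(2 * np.pi * 10.0 * t)      # 10 Hz, outside the band
    print(periodic_frame_mask(8, stride=2))
    print(bandlimit_penalty(pulse, fps), bandlimit_penalty(flicker, fps))
```

Under these assumptions, a signal whose energy sits inside the pulse band incurs a penalty near zero, while out-of-band energy pushes the penalty toward one. In a training setup, a differentiable version of this term would be computed on the predicted signal; the framework and loss weighting are left open here.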

Taxonomy

Topics: Non-Invasive Vital Sign Monitoring · Optical Imaging and Spectroscopy Techniques · Emotion and Mood Recognition