GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement

Hao Wang; Euijoon Ahn; Jinman Kim

arXiv:2406.13136·cs.CV·June 21, 2024

TL;DR

This paper empirically investigates how general video transformer architectures can be adapted for remote physiological measurement from facial videos, proposing guidelines that improve robustness without specialized modules.

Contribution

It introduces practical guidelines for adapting general video transformers to RPM, eliminating the need for RPM-specific modules and enhancing robustness across datasets.

Findings

01. GVT2RPM achieves comparable or better accuracy than RPM-specific methods.

02. The proposed guidelines generalize across different video transformer architectures.

03. The method demonstrates robustness in intra- and cross-dataset evaluations.

Abstract

Remote physiological measurement (RPM) is an essential tool for healthcare monitoring as it enables the measurement of physiological signs, e.g., heart rate, in a remote setting via physical wearables. Recently, with facial videos, we have seen rapid advancements in video-based RPMs. However, adopting facial videos for RPM in the clinical setting largely depends on accuracy and robustness (i.e., working across patient populations). Fortunately, the capability of the state-of-the-art transformer architecture in general (natural) video understanding has resulted in marked improvements and has been translated to facial understanding, including RPM. However, existing RPM methods usually need RPM-specific modules, e.g., temporal difference convolution and handcrafted feature maps. Although these customized modules can increase accuracy, their robustness has not been demonstrated across…
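
The abstract names heart rate as an example of a measured physiological sign. As a minimal illustrative sketch, not the paper's method, the snippet below shows one common way a heart rate can be read off a pulse-like waveform: pick the frequency with the most spectral power inside a plausible heart-rate band. The function name `estimate_heart_rate_bpm`, the 42 to 180 bpm band, the 0.5 bpm step, and the synthetic 30 fps signal are all assumptions made for illustration; the paper's actual pipeline and evaluation code are not described on this page.

```python
import math


def estimate_heart_rate_bpm(signal, fps, lo_bpm=42.0, hi_bpm=180.0, step_bpm=0.5):
    """Return the candidate rate (in bpm) with the most spectral power.

    Hypothetical helper: scans a band of candidate heart rates and projects
    the mean-removed signal onto a sinusoid at each candidate frequency.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset

    best_bpm, best_power = lo_bpm, -1.0
    steps = int(round((hi_bpm - lo_bpm) / step_bpm))
    for i in range(steps + 1):
        bpm = lo_bpm + i * step_bpm
        omega = 2.0 * math.pi * (bpm / 60.0) / fps  # radians per frame
        re = sum(x * math.cos(omega * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(omega * t) for t, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10-second "pulse" at 1.2 Hz (72 beats per minute), sampled at 30 fps.
fps = 30
pulse = [math.sin(2.0 * math.pi * 1.2 * t / fps) for t in range(10 * fps)]
estimated = estimate_heart_rate_bpm(pulse, fps)
print(f"Estimated heart rate: {estimated:.1f} bpm")
```

In a video-based setting, the input waveform would come from a model's predicted pulse signal rather than a synthetic sinusoid, and real signals would typically need band-pass filtering before the peak search.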

Taxonomy

Topics: Advanced Computing and Algorithms · Industrial Vision Systems and Defect Detection · Muscle activation and electromyography studies

Methods: ALIGN · Convolution