Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang, Yifei Huang, Ruicong Liu, Yoichi Sato

TL;DR
This paper introduces a self-supervised learning approach that combines body-worn IMU motion data with egocentric video for action recognition, using graph-based IMU modeling and MAE pretraining to improve robustness and performance.
Contribution
It proposes a new multimodal action recognition method integrating IMU and video data with MAE pretraining and graph-based IMU modeling, addressing data scarcity and device variability.
Findings
The method achieves state-of-the-art results on multiple datasets.
It remains effective when some IMUs are missing or the video is corrupted.
The results validate the benefits of MAE pretraining and graph-based IMU modeling.
Abstract
Compared with visual signals, Inertial Measurement Units (IMUs) placed on human limbs can capture accurate motion signals while being robust to lighting variation and occlusion. While these characteristics are intuitively valuable to help egocentric action recognition, the potential of IMUs remains under-explored. In this work, we present a novel method for action recognition that integrates motion data from body-worn IMUs with egocentric video. Due to the scarcity of labeled multimodal data, we design an MAE-based self-supervised pretraining method, obtaining strong multi-modal representations via modeling the natural correlation between visual and motion signals. To model the complex relation of multiple IMU devices placed across the body, we exploit the collaborative dynamics in multiple IMU devices and propose to embed the relative motion features of human joints into a graph…
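The two ideas named in the abstract, masked-autoencoder pretraining and relative motion features between IMU devices, can be illustrated with a minimal sketch. The sketch is not the authors' implementation. The function names, the device names, the scalar readings, and the 0.75 mask ratio are all hypothetical choices for illustration, and the per-timestep difference is only one simple way to express "relative motion" between two nodes.

```python
import random
from itertools import combinations


def random_mask(num_tokens, mask_ratio, rng):
    """Split token indices into (kept, masked) lists, MAE-style.

    A masked autoencoder encodes only the kept tokens and is trained
    to reconstruct the masked ones.
    """
    num_masked = int(num_tokens * mask_ratio)
    order = list(range(num_tokens))
    rng.shuffle(order)
    masked = sorted(order[:num_masked])
    kept = sorted(order[num_masked:])
    return kept, masked


def relative_motion_edges(imu_signals):
    """Build edge features as per-timestep differences between IMU pairs.

    imu_signals maps a (hypothetical) device name to a list of readings,
    one float per timestep; all lists are assumed to have equal length.
    Each unordered pair of devices becomes one graph edge.
    """
    edges = {}
    for a, b in combinations(sorted(imu_signals), 2):
        edges[(a, b)] = [x - y for x, y in zip(imu_signals[a], imu_signals[b])]
    return edges


# Illustrative values only; none of these come from the paper.
rng = random.Random(0)
kept, masked = random_mask(num_tokens=8, mask_ratio=0.75, rng=rng)

signals = {
    "left_wrist": [0.0, 1.0, 2.0],
    "right_wrist": [0.5, 0.5, 0.5],
    "head": [1.0, 1.0, 1.0],
}
edges = relative_motion_edges(signals)
```

In a real model the readings would be multi-axis accelerometer and gyroscope vectors rather than scalars, and the edge features would feed a graph network rather than a dictionary.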
Taxonomy
Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Anomaly Detection Techniques and Applications
