Improving Out-of-distribution Human Activity Recognition via IMU-Video Cross-modal Representation Learning

Seyyed Saeid Cheshmi; Buyao Lyu; Thomas Lisko; Rajesh Rajamani; Robert A. McGovern; Yogatheesan Varatharajah

arXiv:2507.13482·cs.LG·July 21, 2025

TL;DR

This paper introduces a cross-modal self-supervised pretraining method that uses IMU-video data to improve out-of-distribution human activity recognition, including in clinical applications such as monitoring patients with Parkinson's disease, and reports better generalization than existing IMU-only and IMU-video pretraining methods.

Contribution

The study presents a novel cross-modal pretraining approach that enhances the generalizability of HAR models to unseen data and environments, outperforming existing IMU-only and IMU-video pretraining techniques.

Findings

01 Outperforms state-of-the-art IMU-video pretraining in zero-shot and few-shot settings.

02 Improves generalization of HAR models to out-of-distribution datasets.

03 Effective in clinical scenarios like Parkinson's disease monitoring.

Abstract

Human Activity Recognition (HAR) based on wearable inertial sensors plays a critical role in remote health monitoring. In patients with movement disorders, the ability to detect abnormal patient movements in their home environments can enable continuous optimization of treatments and help alert caretakers as needed. Machine learning approaches have been proposed for HAR tasks using Inertial Measurement Unit (IMU) data; however, most rely on application-specific labels and lack generalizability to data collected in different environments or populations. To address this limitation, we propose a new cross-modal self-supervised pretraining approach to learn representations from large-scale unlabeled IMU-video data and demonstrate improved generalizability in HAR tasks on out-of-distribution (OOD) IMU datasets, including a dataset collected from patients with Parkinson's disease.…
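
The abstract as shown here does not say which training objective the cross-modal pretraining uses. As an illustration only, the sketch below shows one common way to align paired embeddings from two modalities without labels: a symmetric contrastive (InfoNCE) loss. The function names, the temperature value, and the toy embeddings are all assumptions made for this example and are not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def symmetric_info_nce(imu_emb, video_emb, temperature=0.1):
    """Hypothetical cross-modal loss: row i of each input is one paired clip.

    Matching IMU/video pairs are pulled together and all other pairs in
    the batch act as negatives. The temperature is an assumed value.
    """
    imu = [l2_normalize(v) for v in imu_emb]
    vid = [l2_normalize(v) for v in video_emb]
    # Cosine similarity between every IMU clip and every video clip.
    logits = [[sum(a * b for a, b in zip(u, w)) / temperature for w in vid]
              for u in imu]
    logits_t = [list(col) for col in zip(*logits)]  # video-to-IMU direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy batch of two clips with 2-D embeddings (made-up numbers).
aligned_imu = [[1.0, 0.0], [0.0, 1.0]]
aligned_video = [[0.9, 0.1], [0.1, 0.9]]
shuffled_video = [[0.1, 0.9], [0.9, 0.1]]  # pairs deliberately mismatched

loss_aligned = symmetric_info_nce(aligned_imu, aligned_video)
loss_shuffled = symmetric_info_nce(aligned_imu, shuffled_video)
print(loss_aligned < loss_shuffled)
```

In a setup of this kind, the IMU encoder trained with such a loss would afterwards be used on its own for downstream HAR, which is the part that needs to transfer to out-of-distribution IMU data.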

Taxonomy

Topics: Context-Aware Activity Recognition Systems · Human Pose and Action Recognition · Balance, Gait, and Falls Prevention