MuJo: Multimodal Joint Feature Space Learning for Human Activity Recognition
Stefan Gerd Fritsch, Cennet Oguz, Vitor Fortes Rey, Lala Ray, Maximilian Kiefer-Emmanouilidis, Paul Lukowicz

TL;DR
This paper introduces MuJo, a pre-training method that uses a new multimodal dataset, FiMAD, to improve human activity recognition across sensor and video modalities, especially when labeled data is limited.
Contribution
The paper presents MuJo, a novel joint feature space learning approach using FiMAD, a comprehensive multimodal dataset, to enhance HAR performance and data efficiency across multiple datasets.
Findings
Classifiers pre-trained on FiMAD improve HAR accuracy.
MuJo outperforms other self-supervised methods in data efficiency.
MuJo achieves high F1-scores with limited training data.
Abstract
Human activity recognition (HAR) is a long-standing problem in artificial intelligence with applications in a broad range of areas, including healthcare, sports and fitness, security, and more. The performance of HAR in real-world settings is strongly dependent on the type and quality of the input signal that can be acquired. Given an unobstructed, high-quality camera view of a scene, computer vision systems, in particular in conjunction with foundation models, can today fairly reliably distinguish complex activities. On the other hand, recognition using modalities such as wearable sensors (which are often more broadly available, e.g., in mobile phones and smartwatches) is a more difficult problem, as the signals often contain less information and labeled training data is more difficult to acquire. To alleviate the need for labeled data, we introduce our comprehensive Fitness Multimodal…
Taxonomy
Topics: Context-Aware Activity Recognition Systems · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
