Lightweight and Generalizable Multi-Sensor Human Activity Recognition via Cascaded Fusion and Style-Augmented Decomposition
Wang Chenglong, Zhuo Yan, Ding Wenbo, Chen Xinlei

TL;DR
This paper introduces a lightweight, generalizable multi-sensor human activity recognition framework that improves efficiency and robustness through cascaded fusion and style-based data augmentation, outperforming state-of-the-art methods.
Contribution
The authors propose a novel cascaded fusion block and style augmentation techniques that reduce computational complexity and enhance generalization in wearable human activity recognition.
Findings
Outperforms state-of-the-art methods in accuracy and macro-F1 score.
Reduces computational overhead by over 30% compared to attention-based models.
Maintains robustness to data variations through style augmentation.
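This summary does not say which style augmentation the authors use. As a rough illustration of the general idea, the sketch below swaps in statistic mixing in the spirit of MixStyle: each window's per-channel mean and standard deviation (its "style") are blended with those of another window in the batch, while the normalized signal shape is kept. The function name `mix_style`, the `(batch, channels, time)` layout, and the `alpha` value are assumptions made for this example, not details from the paper.

```python
import numpy as np

def mix_style(x, rng, alpha=0.1, eps=1e-6):
    """Blend per-channel statistics between samples (illustrative only).

    x: array of shape (batch, channels, time) holding sensor windows.
    """
    # Per-sample, per-channel "style": mean and standard deviation over time.
    mu = x.mean(axis=2, keepdims=True)
    sigma = np.sqrt(x.var(axis=2, keepdims=True) + eps)
    normed = (x - mu) / sigma

    # Pair each sample with a randomly chosen partner from the same batch.
    perm = rng.permutation(x.shape[0])
    # One mixing weight per sample, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(x.shape[0], 1, 1))

    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]

    # Re-apply the blended style to the normalized content.
    return normed * sigma_mix + mu_mix

rng = np.random.default_rng(0)
windows = rng.standard_normal((8, 6, 128))  # 8 windows, 6 channels, 128 steps
augmented = mix_style(windows, rng)
```

The augmented batch has the same shape as the input, so a step like this could sit in front of any feature extractor during training and be skipped at inference time.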
Abstract
Wearable Human Activity Recognition (WHAR) is a prominent research area within ubiquitous computing, whose core lies in effectively modeling intra- and inter-sensor spatio-temporal relationships from multi-modal time series data. Existing methods either suffer from high computational complexity due to attention-based fusion or lack robustness to data variations during feature extraction. To address these issues, we propose a lightweight and generalizable framework that retains the core "decomposition-extraction-fusion" paradigm while introducing two key innovations. First, we replace the computationally expensive Attention and Cross-Variable Fusion (CVF) modules with a Cascaded Fusion Block (CFB), which achieves efficient feature interaction without explicit attention weights through the operational process of "compression-recursion-concatenation-fusion". Second, we integrate a…
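The abstract names the four CFB steps but, as quoted here, does not give their exact form. The sketch below is one possible reading of "compression-recursion-concatenation-fusion", written under assumptions: every weight matrix, the `tanh` nonlinearity, the zero initial state, and the function name `cascaded_fusion` are hypothetical and are not taken from the paper.

```python
import numpy as np

def cascaded_fusion(features, w_compress, w_step, w_fuse):
    """Fuse per-sensor features without attention weights (illustrative only).

    features:   (num_sensors, d_model), one feature vector per sensor
    w_compress: (d_model, d_small)
    w_step:     (2 * d_small, d_small)
    w_fuse:     (num_sensors * d_small, d_model)
    """
    # Compression: project each sensor feature to a smaller width.
    compressed = np.tanh(features @ w_compress)

    # Recursion: each stage mixes the current sensor with the previous stage.
    state = np.zeros(compressed.shape[1])
    stages = []
    for z in compressed:
        state = np.tanh(np.concatenate([state, z]) @ w_step)
        stages.append(state)

    # Concatenation: keep the output of every stage.
    stacked = np.concatenate(stages)

    # Fusion: a single linear map back to the model width.
    return stacked @ w_fuse

rng = np.random.default_rng(0)
num_sensors, d_model, d_small = 3, 16, 4
features = rng.standard_normal((num_sensors, d_model))
fused = cascaded_fusion(
    features,
    rng.standard_normal((d_model, d_small)),
    rng.standard_normal((2 * d_small, d_small)),
    rng.standard_normal((num_sensors * d_small, d_model)),
)
print(fused.shape)  # (16,)
```

In this reading, the cascade takes one step per sensor, so its cost grows linearly with the number of sensors, whereas pairwise attention across sensors compares every sensor with every other one.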
