Detecting Informative Channels: ActionFormer
Kunpeng Zhao, Asahi Miyazaki, Tsuyoshi Okita

TL;DR
This paper enhances the ActionFormer model for Human Activity Recognition by adapting it for sensor signals, addressing temporal dynamics and feature interdependencies, resulting in significant performance improvements on inertial data.
Contribution
The paper introduces a modified ActionFormer architecture with Sequence-and-Excitation and Swish activation, improving HAR accuracy for sensor signals.
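As a rough illustration of the kind of channel gating described here, the sketch below assumes that "Sequence-and-Excitation" follows the familiar squeeze-and-excitation pattern: average each sensor channel over time, pass the result through a small bottleneck, and rescale each channel by a gate in (0, 1). Placing Swish inside the bottleneck, the bottleneck size, the random weights, and the names `se_block` and `random_weights` are all assumptions made for illustration. None of them is taken from the paper.

```python
import math
import random


def sigmoid(v):
    # Numerically stable logistic function.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def swish(v):
    # Swish activation: x * sigmoid(x).
    return v * sigmoid(v)


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def se_block(x, w1, w2):
    """Channel gating over a sensor window (hypothetical helper).

    x  : list of C channels, each a list of T samples
    w1 : bottleneck weights, shape (C // r) x C
    w2 : expansion weights, shape C x (C // r)
    Returns the rescaled window and the per-channel gates.
    """
    # Squeeze: one summary value per channel (mean over time).
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation: bottleneck with Swish (assumed placement), then sigmoid gates.
    h = [swish(v) for v in matvec(w1, z)]
    s = [sigmoid(v) for v in matvec(w2, h)]
    # Scale: multiply every sample of a channel by that channel's gate.
    y = [[gate * v for v in ch] for gate, ch in zip(s, x)]
    return y, s


def random_weights(rows, cols, rng):
    # Untrained placeholder weights, for illustration only.
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Hypothetical window: 6 inertial channels, 50 time steps.
window = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(6)]
w1 = random_weights(3, 6, rng)  # reduction ratio r = 2 (assumed)
w2 = random_weights(6, 3, rng)
gated, gates = se_block(window, w1, w2)
print([round(g, 3) for g in gates])
```

The bottleneck keeps the number of extra parameters small (2 · C · C/r weights for C channels), which is the usual reason for choosing this design. In a trained model, the gates would indicate which channels are treated as informative.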
Findings
Achieved a 16.01% increase in average mAP on the WEAR dataset.
Effectively captures subtle activity changes with reduced model defects.
Demonstrates the effectiveness of the proposed modifications in sensor-based HAR.
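For context on the mAP figure above: in temporal activity localization, a predicted segment is usually counted as correct when its temporal intersection-over-union (tIoU) with a ground-truth segment exceeds a threshold, and "average mAP" typically averages mAP over several such thresholds. This summary does not state the thresholds used in the paper. The sketch below shows only the tIoU step; the function name and the example segments are illustrative assumptions.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two segments given as (start, end) in seconds."""
    # Overlap length, clipped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union is the sum of both lengths minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth activity segments.
print(temporal_iou((2.0, 6.0), (4.0, 8.0)))
```

Here the segments overlap for 2 seconds and together span 6 seconds, so the tIoU is 1/3. That prediction would count as a match at a threshold of 0.3 but not at 0.5.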
Abstract
Human Activity Recognition (HAR) has recently seen advances with Transformer-based models. In particular, ActionFormer offers a new perspective for HAR: in addition to activity labels, it outputs the boundaries of the activities. ActionFormer was originally proposed for image/video input, but it has since been adapted to take sensor signals as input as well. We analyze this adaptation extensively in terms of deep learning architectures. Prior reports point to high temporal dynamics, which limit the model's ability to capture subtle changes effectively, and to interdependencies between the spatial and temporal features. We propose a modified ActionFormer that reduces these defects for sensor signals. The key to our approach lies in accordance with the Sequence-and-Excitation strategy to minimize the increase in…
Taxonomy
Topics: Computability, Logic, AI Algorithms · Benford’s Law and Fraud Detection
