Multi-View Fusion Transformer for Sensor-Based Human Activity Recognition
Yimu Wang, Kun Yu, Yan Wang, Hui Xue

TL;DR
This paper introduces a multi-view fusion transformer that combines temporal, frequency, and statistical views of sensor data with a novel attention mechanism to improve sensor-based human activity recognition accuracy.
Contribution
It proposes the multi-view fusion transformer (MVFT), together with a new attention mechanism, to better integrate diverse data views for HAR.
Findings
Outperforms several state-of-the-art methods on two datasets.
Effectively captures inter- and intra-view relations.
Enhances feature extraction for activity recognition.
Abstract
As a fundamental problem in ubiquitous computing and machine learning, sensor-based human activity recognition (HAR) has drawn extensive attention and made great progress in recent years. HAR aims to recognize human activities from rich time-series data collected by multi-modal sensors such as accelerometers and gyroscopes. However, recent deep learning methods focus on a single view of the data, i.e., the temporal view, while shallow methods tend to rely on hand-crafted features for recognition, e.g., the statistical view. In this paper, to extract better features and improve performance, we propose a novel method, namely the multi-view fusion transformer (MVFT), along with a novel attention mechanism. First, MVFT encodes three views of information, i.e., the temporal, frequency, and statistical views, to generate multi-view features. Second, the novel…
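To make the three views concrete, here is a minimal sketch of how one window of single-channel sensor data might be split into temporal, frequency, and statistical views. It is not the authors' encoder: the function name `extract_views`, the naive DFT, and the four chosen statistics are all illustrative assumptions, and the abstract does not specify which features MVFT actually uses.

```python
import cmath
import math
import statistics


def extract_views(signal):
    """Split one single-channel sensor window into three views.

    Hypothetical helper: the feature choices below are assumptions
    for illustration, not the features used in the paper.
    """
    n = len(signal)

    # Temporal view: the raw samples in time order.
    temporal = list(signal)

    # Frequency view: magnitude of a naive DFT, keeping only the
    # non-negative frequency bins (0 .. n // 2).
    frequency = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

    # Statistical view: a few simple hand-crafted summary statistics.
    statistical = [
        statistics.fmean(signal),   # mean
        statistics.pstdev(signal),  # population standard deviation
        min(signal),
        max(signal),
    ]
    return temporal, frequency, statistical


# Hypothetical window: a 4 Hz sine sampled at 32 Hz for one second.
window = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
temporal, frequency, statistical = extract_views(window)

# Index of the strongest frequency bin in this window.
peak_bin = max(range(len(frequency)), key=frequency.__getitem__)
print(len(temporal), len(frequency), len(statistical), peak_bin)
```

For the synthetic sine above, the strongest frequency bin is bin 4, matching the 4 Hz signal. In a real pipeline each view would be computed per sensor channel and then passed to a learned encoder, which this sketch leaves out.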
Taxonomy
Topics: Context-Aware Activity Recognition Systems
