Rethinking Attention Mechanism in Time Series Classification
Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao

TL;DR
This paper introduces flexible multi-head linear attention (FMLA) for time series classification. FMLA lowers the computational complexity of attention and, through a proportional position mask, reduces the influence of noise. Experiments on 85 datasets support the approach.
Contribution
The paper proposes a novel multi-head linear attention method with a masking mechanism for better efficiency and noise reduction in time series classification.
Findings
Achieves comparable accuracy to state-of-the-art methods on 85 datasets.
Significantly reduces computational complexity compared to Transformer models.
Demonstrates improved efficiency with lower floating-point operations and fewer parameters.
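To make the complexity claim concrete, here is a minimal sketch of a generic kernel-based multi-head linear attention in NumPy. It is not the paper's FMLA, whose exact formulation is not given in this summary. The feature map `elu(x) + 1`, the function names, and the projection matrices `wq`, `wk`, `wv` are all assumptions chosen for illustration. The point is only that regrouping the product as `phi(Q) (phi(K)^T V)` avoids building the `n x n` attention matrix, so the cost grows linearly with the series length `n`.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map elu(x) + 1; the paper may use something else.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). No (n, n) matrix is ever formed.
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v                 # (d, d_v) summary of keys and values
    z = q @ k.sum(axis=0)        # (n,) per-query normalizer
    return (q @ kv) / z[:, None]


def multi_head_linear_attention(x, wq, wk, wv, num_heads):
    # x: (n, d_model); wq, wk, wv: (d_model, d_model) hypothetical projections.
    n, d_model = x.shape
    d_head = d_model // num_heads
    q, k, v = x @ wq, x @ wk, x @ wv
    heads = []
    for h in range(num_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        heads.append(linear_attention(q[:, s], k[:, s], v[:, s]))
    return np.concatenate(heads, axis=1)  # (n, d_model)
```

With a fixed head dimension, the work per head is proportional to `n * d * d_v` rather than `n * n * d`, which is where the saving over standard softmax attention comes from for long series.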
Abstract
Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, their application to time series classification (TSC) has not yet been explored in depth, so many TSC algorithms still suffer from general problems of the attention mechanism, such as quadratic complexity. In this paper, we improve the efficiency and performance of the attention mechanism by proposing flexible multi-head linear attention (FMLA), which enhances locality awareness through layer-wise interactions with deformable convolutional blocks and online knowledge distillation. In addition, we propose a simple but effective mask mechanism that reduces the influence of noise in time series and decreases the redundancy of FMLA by masking a proportion of the positions of each given series. To stabilize this mechanism, samples are…
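The proportional masking idea in the abstract can be illustrated with a short sketch. This is an assumption-laden illustration, not the authors' procedure: the abstract does not say how positions are chosen or what replaces them, so the code below assumes uniform random selection without replacement and zero-filling. The function name `mask_positions` and the parameter `ratio` are hypothetical.

```python
import numpy as np


def mask_positions(series, ratio, rng):
    # series: (length, channels). Returns the masked copy and a boolean keep-mask.
    # Assumption: masked time steps are zeroed out; the paper may handle them differently.
    length = series.shape[0]
    n_mask = int(round(ratio * length))
    idx = rng.choice(length, size=n_mask, replace=False)
    keep = np.ones(length, dtype=bool)
    keep[idx] = False
    return series * keep[:, None], keep


rng = np.random.default_rng(0)
series = rng.normal(size=(100, 3))
masked, keep = mask_positions(series, ratio=0.25, rng=rng)
```

Because the number of masked positions scales with the series length, short and long series lose the same fraction of time steps.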
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications · EEG and Brain-Computer Interfaces
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Linear Attention
