VSFormer: Value and Shape-Aware Transformer with Prior-Enhanced Self-Attention for Multivariate Time Series Classification
Wenjie Xi, Rundong Zuo, Alejandro Alvarez, Jie Zhang, Byron Choi, Jessica Lin

TL;DR
VSFormer is a novel Transformer-based model for multivariate time series classification that leverages value and shape features along with prior information to improve accuracy, especially in data lacking clear discriminative patterns.
Contribution
The paper introduces VSFormer, which combines shape and value features with prior-enhanced self-attention for improved time series classification accuracy.
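As a rough illustration of what "prior-enhanced self-attention" could look like, the sketch below adds a bias matrix to the scaled dot-product attention scores before the softmax. This summary does not say how VSFormer actually injects its class-specific prior, so the additive form, the function name `prior_biased_attention`, and the `prior_bias` argument are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_biased_attention(q, k, v, prior_bias):
    """Scaled dot-product attention with a HYPOTHETICAL additive prior term.

    q, k, v:     arrays of shape (seq_len, d)
    prior_bias:  array of shape (seq_len, seq_len); an assumed stand-in for
                 prior information, not the mechanism used in the paper.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                    # standard attention scores
    weights = softmax(scores + prior_bias, axis=-1)  # prior shifts the weights
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8
q, k, v = (rng.standard_normal((seq_len, d)) for _ in range(3))

# Assumed prior: favour attending to position 0.
prior = np.zeros((seq_len, seq_len))
prior[:, 0] = 2.0

out, w = prior_biased_attention(q, k, v, prior)
_, w_plain = prior_biased_attention(q, k, v, np.zeros((seq_len, seq_len)))
```

With a zero `prior_bias` the function reduces to ordinary single-head attention; a positive bias on a column raises the weight every query places on that position, which is one simple way prior knowledge could steer attention away from classification-irrelevant features.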
Findings
Outperforms state-of-the-art models on 30 UEA datasets.
Ablation studies support the contribution of both the encoding and the attention mechanisms.
Case study demonstrates interpretability on real-world data.
Abstract
Multivariate time series classification is a crucial task in data mining, attracting growing research interest due to its broad applications. While many existing methods focus on discovering discriminative patterns in time series, real-world data does not always present such patterns, and sometimes raw numerical values can also serve as discriminative features. Additionally, the recent success of Transformer models has inspired many studies. However, when applied to time series classification, the self-attention mechanisms in Transformer models could introduce classification-irrelevant features, thereby compromising accuracy. To address these challenges, we propose a novel method, VSFormer, that incorporates both discriminative patterns (shape) and numerical information (value). In addition, we extract class-specific prior information derived from supervised information to enrich the…
Taxonomy
Topics: Time Series Analysis and Forecasting
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Adam
