Gated Transformer Networks for Multivariate Time Series Classification
Minghao Liu, Shengqi Ren, Siyuan Ma, Jiahui Jiao, Yizhou Chen, Zhiguang Wang, Wei Song

TL;DR
This paper introduces Gated Transformer Networks (GTN), an extension of Transformer models with gating mechanisms, tailored for multivariate time series classification, demonstrating competitive performance and interpretability on multiple datasets.
Contribution
The paper proposes Gated Transformer Networks with a novel gating mechanism that effectively models channel-wise and step-wise correlations in multivariate time series.
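The summary above does not spell out the form of the gate, so the following is only a minimal sketch of one plausible reading: each tower produces a feature vector, a linear layer over their concatenation yields two logits, a softmax turns those into two gate weights, and each tower's features are scaled by its weight before being concatenated for the classifier. The names `gate_merge`, `step_feat`, `chan_feat`, `weights`, and `bias` are hypothetical, and the plain-Python lists stand in for learned tensors.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gate_merge(step_feat, chan_feat, weights, bias):
    """Merge two tower outputs with a two-way softmax gate (assumed form).

    step_feat, chan_feat: flattened features from the step-wise and
        channel-wise towers (hypothetical names).
    weights: two rows, each of length len(step_feat) + len(chan_feat).
    bias: two floats, one per gate logit.
    """
    concat = step_feat + chan_feat
    # One logit per tower: h = W . concat + b
    logits = [
        sum(w * x for w, x in zip(row, concat)) + b
        for row, b in zip(weights, bias)
    ]
    g_step, g_chan = softmax(logits)
    # Scale each tower by its gate weight, then concatenate the results.
    merged = [g_step * x for x in step_feat] + [g_chan * x for x in chan_feat]
    return merged, (g_step, g_chan)


if __name__ == "__main__":
    step = [1.0, 2.0]
    chan = [3.0, 4.0]
    # Zero weights and biases give equal logits, so both gates are 0.5.
    w = [[0.0] * 4, [0.0] * 4]
    b = [0.0, 0.0]
    merged, gates = gate_merge(step, chan, w, b)
    print(gates, merged)
```

In a trained model the weights would be learned, so the gate could lean toward whichever tower is more informative for a given input; the zero-weight case here only illustrates the mechanics.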
Findings
GTN achieves results competitive with state-of-the-art models.
Gating improves modeling of multivariate correlations.
Attention maps provide interpretability for time series analysis.
Abstract
Deep learning models (primarily convolutional networks and LSTMs) for time series classification have been studied broadly by the community, with wide applications in domains such as healthcare, finance, industrial engineering and IoT. Meanwhile, Transformer Networks have recently achieved frontier performance on various natural language processing and computer vision tasks. In this work, we explore a simple extension of current Transformer Networks with gating, named Gated Transformer Networks (GTN), for the multivariate time series classification problem. With gating that merges two Transformer towers, which model the channel-wise and step-wise correlations respectively, we show how GTN is naturally and effectively suited to the multivariate time series classification task. We conduct comprehensive experiments on thirteen datasets with a full ablation study. Our results…
Taxonomy
Topics: Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications · Complex Systems and Time Series Analysis
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam · Softmax · Dense Connections · Byte Pair Encoding · Dropout · Residual Connection · Layer Normalization
