CASA: CNN Autoencoder-based Score Attention for Efficient Multivariate Long-term Time-series Forecasting
Minhyuk Lee, HyeKyung Yoon, MyungJoo Kang

TL;DR
This paper introduces CASA, a CNN Autoencoder-based Score Attention mechanism that enhances multivariate long-term time series forecasting by reducing computational costs and improving accuracy across diverse datasets.
Contribution
CASA is a novel, model-agnostic attention mechanism that significantly reduces memory usage and accelerates inference in Transformer-based models for time series forecasting.
Findings
CASA reduces memory consumption by up to 77.7%.
CASA speeds up inference by 44.0%.
CASA achieves state-of-the-art performance on 87.5% of metrics.
Abstract
Multivariate long-term time series forecasting is critical for applications such as weather prediction and traffic analysis. Transformer variants have improved prediction accuracy, and different input-processing approaches have further advanced the field, including point-wise, channel-wise, and patch-wise tokenization. However, previous studies still have limitations in time complexity, computational resources, and cross-dimensional interactions. To address these limitations, we introduce a novel CNN Autoencoder-based Score Attention mechanism (CASA), which can be integrated into diverse Transformers in a model-agnostic manner, reducing memory usage and improving model performance. Experiments on eight real-world datasets validate that CASA decreases computational resources by up to 77.7%, accelerates…
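The abstract names three tokenization schemes without showing how they differ. The sketch below is an illustration only and is not taken from the paper: it assumes a toy series stored as a nested list of shape [T][C] (T time steps, C channels), and the function names, the `patch_len` and `stride` parameters, and the toy values are all hypothetical. It shows only how each scheme changes the number of tokens and the size of each token; it does not implement CASA itself.

```python
from typing import List

Series = List[List[float]]  # assumed layout: [T][C], T time steps, C channels


def pointwise_tokens(x: Series) -> List[List[float]]:
    """One token per time step; each token holds all C channel values."""
    return [list(row) for row in x]


def channelwise_tokens(x: Series) -> List[List[float]]:
    """One token per channel; each token holds that channel's full history."""
    return [list(col) for col in zip(*x)]


def patchwise_tokens(x: Series, patch_len: int, stride: int) -> List[List[List[float]]]:
    """Per channel, slide a window of patch_len over time with the given stride."""
    channels = channelwise_tokens(x)
    return [
        [ch[s:s + patch_len] for s in range(0, len(ch) - patch_len + 1, stride)]
        for ch in channels
    ]


# Toy input (arbitrary values): T=8 time steps, C=3 channels, x[t][c] = 10*c + t.
x = [[float(10 * c + t) for c in range(3)] for t in range(8)]

pw = pointwise_tokens(x)                       # 8 tokens of size 3
cw = channelwise_tokens(x)                     # 3 tokens of size 8
pt = patchwise_tokens(x, patch_len=4, stride=2)  # 3 channels x 3 patches of size 4

print("point-wise:  ", len(pw), "tokens of size", len(pw[0]))
print("channel-wise:", len(cw), "tokens of size", len(cw[0]))
print("patch-wise:  ", len(pt), "channels x", len(pt[0]), "patches of size", len(pt[0][0]))
```

Because standard self-attention cost grows with the square of the token count, the choice of scheme decides whether sequence length or channel count dominates the cost, which is one way to read the abstract's concern with time complexity and computational resources.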
Taxonomy
Topics: Time Series Analysis and Forecasting · Neural Networks and Applications · Stock Market Forecasting Methods
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Adam · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax
