A Mamba Foundation Model for Time Series Forecasting
Haoyu Ma, Yushu Chen, Wenlai Zhao, Jinzhe Yang, Yingsheng Ji, Xinghua Xu, Xiaozhu Liu, Hao Jing, Shengzhuo Liu, Guangwen Yang

TL;DR
This paper introduces TSMamba, a linear-complexity time series foundation model built on the Mamba architecture. It uses transfer learning from pretrained Mamba LLMs to reach high accuracy with less training data and lower training cost.
Contribution
The paper presents TSMamba, a new time series foundation model with linear complexity built on the Mamba architecture, enabling effective zero-shot and few-shot forecasting.
Findings
Zero-shot performance comparable to state-of-the-art models
Achieves high accuracy with less training data
Outperforms task-specific models in full-shot settings
Abstract
Time series foundation models have demonstrated strong performance in zero-shot learning, making them well-suited for predicting rapidly evolving patterns in real-world applications where relevant training data are scarce. However, most of these models rely on the Transformer architecture, which incurs quadratic complexity as input length increases. To address this, we introduce TSMamba, a linear-complexity foundation model for time series forecasting built on the Mamba architecture. The model captures temporal dependencies through both forward and backward Mamba encoders, achieving high prediction accuracy. To reduce reliance on large datasets and lower training costs, TSMamba employs a two-stage transfer learning process that leverages pretrained Mamba LLMs, allowing effective time series modeling with a moderate training set. In the first stage, the forward and backward backbones are…
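To make the linear-complexity and forward/backward ideas in the abstract concrete, here is a minimal NumPy sketch. It is not the TSMamba architecture: it assumes a plain diagonal linear state-space recurrence with fixed parameters, whereas Mamba makes its parameters input-dependent, and the names `linear_scan`, `bidirectional_features`, `a`, `b`, and `c` are hypothetical, chosen only for illustration. The sketch shows only that each direction is a single pass over the series, so cost grows linearly with input length.

```python
import numpy as np

def linear_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over a 1-D series.

    h[t] = a * h[t-1] + b * x[t]    (elementwise, hidden state of size len(a))
    y[t] = c . h[t]
    A single pass over x, so the cost is linear in len(x).
    Assumption: a, b, c are fixed; Mamba instead derives them from the input.
    """
    h = np.zeros_like(a, dtype=float)
    y = np.empty(len(x), dtype=float)
    for t, x_t in enumerate(x):
        h = a * h + b * x_t
        y[t] = float(c @ h)
    return y

def bidirectional_features(x, a, b, c):
    """Stack a forward scan and a backward scan as two feature channels."""
    fwd = linear_scan(x, a, b, c)
    # Scan the reversed series, then flip back so index t lines up with x[t].
    bwd = linear_scan(x[::-1], a, b, c)[::-1]
    return np.stack([fwd, bwd], axis=-1)  # shape: (len(x), 2)

# Toy usage: a single impulse at t = 0 with a decay factor of 0.5.
x = np.array([1.0, 0.0, 0.0, 0.0])
a = np.array([0.5])
b = np.array([1.0])
c = np.array([1.0])
features = bidirectional_features(x, a, b, c)
```

In this toy run the forward channel decays after the impulse, while the backward channel is nonzero only at the impulse itself, because the backward scan only sees values at or after each time step.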
Taxonomy
Topics: Forecasting Techniques and Applications · Stock Market Forecasting Methods · Modeling, Simulation, and Optimization
Methods: Linear Layer · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Attention Is All You Need · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Dropout · Absolute Position Encodings
