A Mamba Foundation Model for Time Series Forecasting

Haoyu Ma; Yushu Chen; Wenlai Zhao; Jinzhe Yang; Yingsheng Ji; Xinghua Xu; Xiaozhu Liu; Hao Jing; Shengzhuo Liu; Guangwen Yang

arXiv:2411.02941 · cs.LG · November 6, 2024 · 3 cites

Open Access

TL;DR

This paper introduces TSMamba, a linear-complexity time series foundation model that combines forward and backward Mamba encoders with transfer learning from pretrained Mamba LLMs, reaching high accuracy with less training data and lower training cost.

Contribution

The paper presents TSMamba, a new time series foundation model with linear complexity built on the Mamba architecture, enabling effective zero-shot and few-shot forecasting.

Findings

01 Zero-shot performance comparable to state-of-the-art models

02 Achieves high accuracy with less training data

03 Outperforms task-specific models in full-shot settings

Abstract

Time series foundation models have demonstrated strong performance in zero-shot learning, making them well-suited for predicting rapidly evolving patterns in real-world applications where relevant training data are scarce. However, most of these models rely on the Transformer architecture, which incurs quadratic complexity as input length increases. To address this, we introduce TSMamba, a linear-complexity foundation model for time series forecasting built on the Mamba architecture. The model captures temporal dependencies through both forward and backward Mamba encoders, achieving high prediction accuracy. To reduce reliance on large datasets and lower training costs, TSMamba employs a two-stage transfer learning process that leverages pretrained Mamba LLMs, allowing effective time series modeling with a moderate training set. In the first stage, the forward and backward backbones are…
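The linear-complexity and bidirectional ideas in the abstract can be illustrated with a toy sketch. This is not TSMamba or the Mamba layer itself: it assumes a scalar state-space recurrence h_t = a*h_(t-1) + b*x_t with fixed coefficients, whereas Mamba uses input-dependent (selective) parameters. The names `ssm_scan`, `bidirectional_features`, `a`, and `b` are hypothetical and chosen only for illustration.

```python
def ssm_scan(xs, a=0.9, b=0.1):
    """One pass over the sequence: h_t = a * h_{t-1} + b * x_t.

    The loop touches each element once, so the cost grows linearly
    with sequence length (no pairwise comparison between positions).
    """
    h = 0.0
    states = []
    for x in xs:
        h = a * h + b * x
        states.append(h)
    return states


def bidirectional_features(xs, a=0.9, b=0.1):
    """Pair a forward scan with a backward scan at every time step.

    The backward scan runs over the reversed sequence and is reversed
    again so that position t holds a summary of x_t, x_{t+1}, ...
    """
    forward = ssm_scan(xs, a, b)
    backward = ssm_scan(xs[::-1], a, b)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    series = [1.0, 0.0, 0.0]
    for step, (fwd, bwd) in enumerate(bidirectional_features(series, a=0.5, b=1.0)):
        print(step, fwd, bwd)
```

In this toy, the forward state at each step summarizes the past and the backward state summarizes the future, which is the general motivation for pairing the two directions; how TSMamba combines its forward and backward encoders is described in the paper itself.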

Taxonomy

Topics: Forecasting Techniques and Applications · Stock Market Forecasting Methods · Modeling, Simulation, and Optimization

Methods: Linear Layer · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Attention Is All You Need · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Dropout · Absolute Position Encodings