Three-Stage Learning Unlocks Strong Performance in Simple Models for Long-Term Time Series Forecasting

Zhenan Yu; Guangxin Jiang; Jin Yang

arXiv:2605.13678·cs.LG·May 14, 2026

TL;DR

This paper introduces STAIR, a three-stage training paradigm that enhances simple models for long-term time series forecasting by progressively capturing shared and variable-specific dynamics without complex architectures.

Contribution

The paper proposes STAIR, a three-stage training framework that improves simple temporal models for long-term forecasting through structured, stagewise training and residual learning, without adding complex architectural modules.

Findings

01 STAIR matches or outperforms recent strong baselines on nine benchmarks.

02 The approach captures both shared and variable-specific dynamics.

03 STAIR keeps the core temporal predictor simple.

Abstract

Recent studies on long-term time series forecasting have shown that simple linear models and MLP-based predictors can achieve strong performance without increasingly complex architectures. However, many competitive baselines still rely on structural priors such as frequency-domain modeling, explicit decomposition, multi-scale mixing, or sophisticated cross-variable interaction modules, while paying less attention to how simple temporal mappings should be trained and organized. In this paper, we propose STAIR, short for Stagewise Temporal Adaptation via Individualization and Residual Learning, a training paradigm for long-term time series forecasting that aims to unlock the capacity of simple temporal mapping models without introducing complex architectural modules. STAIR decomposes forecasting ability into three progressive stages: it first learns common temporal dynamics across…
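
To make the idea of stagewise training concrete, here is a minimal NumPy sketch of one way three progressive stages could be arranged around a plain linear predictor. The abstract is truncated on this page, so the details of each stage are assumptions and not the authors' method: stage 1 fits one linear map shared by all variables, stage 2 fits a per-variable map regularized toward the shared one, and stage 3 fits a small correction on what is left over. All names (`make_windows`, `ridge_fit`, `fit_three_stage`, `predict`), the ridge penalties, and the random tanh features used for the residual stage are hypothetical choices made for illustration.

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Slice a (T, C) series into inputs (N, C, lookback) and targets (N, C, horizon)."""
    n = series.shape[0] - lookback - horizon + 1
    X = np.stack([series[i:i + lookback].T for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon].T for i in range(n)])
    return X, Y

def ridge_fit(X, Y, lam, W0=None):
    """Closed-form minimizer of ||X W - Y||^2 + lam * ||W - W0||^2."""
    d = X.shape[1]
    if W0 is None:
        W0 = np.zeros((d, Y.shape[1]))
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y + lam * W0)

def fit_three_stage(X, Y, lam=1e-3, lam_ind=10.0, lam_res=1.0, n_feat=32, seed=0):
    """Illustrative three-stage fit; the stage definitions are assumptions."""
    N, C, L = X.shape
    H = Y.shape[2]
    # Stage 1 (assumed): a single linear map shared by every variable.
    W_shared = ridge_fit(X.reshape(N * C, L), Y.reshape(N * C, H), lam)
    # Stage 2 (assumed): one map per variable, pulled toward the shared map.
    W_ind = np.stack([ridge_fit(X[:, c], Y[:, c], lam_ind, W0=W_shared)
                      for c in range(C)])
    # Stage 3 (assumed): fit the remaining error with fixed random tanh features.
    rng = np.random.default_rng(seed)
    P = rng.normal(size=(L, n_feat)) / np.sqrt(L)
    V = []
    for c in range(C):
        resid = Y[:, c] - X[:, c] @ W_ind[c]
        V.append(ridge_fit(np.tanh(X[:, c] @ P), resid, lam_res))
    return {"W_shared": W_shared, "W_ind": W_ind, "P": P, "V": np.stack(V)}

def predict(model, X, stage=3):
    """Forecast using the first `stage` stages; X is (N, C, L), output is (N, C, H)."""
    if stage == 1:
        return X @ model["W_shared"]
    out = np.einsum("ncl,clh->nch", X, model["W_ind"])
    if stage == 3:
        out = out + np.einsum("ncf,cfh->nch", np.tanh(X @ model["P"]), model["V"])
    return out

# Synthetic demo: three phase-shifted noisy sinusoids (made-up data).
rng = np.random.default_rng(0)
t = np.arange(400)[:, None]
series = np.sin(2 * np.pi * t / 24 + np.arange(3)) + 0.1 * rng.normal(size=(400, 3))
X, Y = make_windows(series, lookback=48, horizon=12)
model = fit_three_stage(X, Y)
mse = [float(np.mean((predict(model, X, s) - Y) ** 2)) for s in (1, 2, 3)]
print("training MSE after stages 1, 2, 3:", mse)
```

In this sketch each stage starts from the previous stage's solution, so the training error cannot increase from one stage to the next. That says nothing about held-out accuracy, which is what the benchmark comparisons in the paper measure.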
