Seeking SOTA: Time-Series Forecasting Must Adopt Taxonomy-Specific Evaluation to Dispel Illusory Gains

Raeid Saqur; Christoph Bergmeir; Blanka Horvath; Daniel Schmidt; Frank Rudzicz; Terry Lyons

arXiv:2603.15506 · cs.LG · March 17, 2026

Open Access

TL;DR

This paper argues that time-series forecasting needs more diverse and representative benchmarks to measure real progress, in particular to determine whether complex models actually outperform classical methods.

Contribution

It advocates evaluating on diverse datasets and against classical baselines, so that complex models cannot claim superiority on misleading grounds.

Findings

01 Classical models perform comparably to deep learning on standard datasets.

02 Current benchmarks often favor models capturing periodicities.

03 Diverse datasets reveal the limitations of existing models.

Abstract

We argue that the current practice of evaluating AI/ML time-series forecasting models, predominantly on benchmarks characterized by strong, persistent periodicities and seasonalities, obscures real progress by overlooking the performance of efficient classical methods. We demonstrate that these "standard" datasets often exhibit dominant autocorrelation patterns and seasonal cycles that can be effectively captured by simpler linear or statistical models, rendering complex deep learning architectures frequently no more performant than their classical counterparts for these specific data characteristics, and raising questions as to whether any marginal improvements justify the significant increase in computational overhead and model complexity. We call on the community to (I) retire or substantially augment current benchmarks with datasets exhibiting a wider spectrum of non-stationarities,…
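
As an illustration of the abstract's point about strongly periodic data, the following is a minimal sketch, not the authors' experiment. It builds a synthetic series with one dominant seasonal cycle (an assumption made purely for illustration) and compares a seasonal-naive forecast, which repeats the last observed cycle, with a last-value forecast. The function names, the period of 24, the horizon of 48, and the noise level are all hypothetical choices.

```python
import numpy as np


def seasonal_naive(train, horizon, period):
    """Forecast by repeating the last observed seasonal cycle."""
    last_cycle = train[-period:]
    reps = int(np.ceil(horizon / period))
    return np.tile(last_cycle, reps)[:horizon]


def last_value(train, horizon):
    """Forecast by holding the final training value constant."""
    return np.full(horizon, train[-1])


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length arrays."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def make_periodic_series(n, period, noise_std, seed=0):
    """Synthetic series: a sine wave of the given period plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    return 10.0 * np.sin(2 * np.pi * t / period) + rng.normal(0.0, noise_std, n)


# Hypothetical settings: hourly-style data with a daily cycle.
period, horizon = 24, 48
series = make_periodic_series(n=24 * 30, period=period, noise_std=1.0)
train, test = series[:-horizon], series[-horizon:]

mae_seasonal = mae(test, seasonal_naive(train, horizon, period))
mae_last = mae(test, last_value(train, horizon))

print(f"seasonal-naive MAE: {mae_seasonal:.3f}")
print(f"last-value MAE:     {mae_last:.3f}")
```

On data like this, the seasonal-naive error is driven mostly by the noise term, so any more complex model has little room to improve on it. That is the kind of baseline the abstract suggests should be reported alongside deep learning results.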

Taxonomy

Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Forecasting Techniques and Applications