Seeking SOTA: Time-Series Forecasting Must Adopt Taxonomy-Specific Evaluation to Dispel Illusory Gains
Raeid Saqur, Christoph Bergmeir, Blanka Horvath, Daniel Schmidt, Frank Rudzicz, Terry Lyons

TL;DR
This paper argues that time-series forecasting needs more diverse and representative benchmarks to measure real progress, particularly when complex models are compared against classical methods.
Contribution
The paper advocates evaluating on diverse datasets and against classical baselines, so that complex models are not credited with misleading claims of superiority.
Findings
Classical models perform comparably to deep learning on standard datasets.
Current benchmarks often favor models capturing periodicities.
Diverse datasets reveal the limitations of existing models.
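The findings above can be illustrated with a minimal sketch, which is not the paper's experimental setup. It assumes two synthetic series (a noisy daily cycle and a random walk), a period of 24, and a 48-step horizon; all function names are hypothetical. It compares a seasonal-naive baseline with a last-value baseline and reports the autocorrelation at the seasonal lag.

```python
import numpy as np

def seasonal_acf(y, period):
    """Sample autocorrelation of y at lag `period`."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return float(np.dot(d[:-period], d[period:]) / np.dot(d, d))

def seasonal_naive(train, horizon, period):
    """Forecast by repeating the last observed seasonal cycle."""
    train = np.asarray(train, dtype=float)
    last_cycle = train[-period:]
    reps = -(-horizon // period)  # ceiling division
    return np.tile(last_cycle, reps)[:horizon]

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic data: an assumption for illustration, not taken from the paper.
rng = np.random.default_rng(0)
t = np.arange(24 * 60)
periodic = 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, t.size)
random_walk = np.cumsum(rng.normal(0, 1, t.size))

horizon, period = 48, 24
results = {}
for name, series in {"periodic": periodic, "random_walk": random_walk}.items():
    train, test = series[:-horizon], series[-horizon:]
    results[name] = {
        "acf_at_period": seasonal_acf(train, period),
        "mae_seasonal_naive": mae(test, seasonal_naive(train, horizon, period)),
        "mae_last_value": mae(test, np.full(horizon, train[-1])),
    }

for name, row in results.items():
    print(name, row)
```

On the periodic series the seasonal-naive forecast should be far more accurate than the last-value forecast, which is the kind of gain a benchmark dominated by periodic data rewards. The lag autocorrelation is only a crude diagnostic here: a random walk also shows high autocorrelation at many lags without any seasonal cycle, so it should not be read as a seasonality test on its own.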
Abstract
We argue that the current practice of evaluating AI/ML time-series forecasting models, predominantly on benchmarks characterized by strong, persistent periodicities and seasonalities, obscures real progress by overlooking the performance of efficient classical methods. We demonstrate that these "standard" datasets often exhibit dominant autocorrelation patterns and seasonal cycles that can be effectively captured by simpler linear or statistical models, rendering complex deep learning architectures frequently no more performant than their classical counterparts for these specific data characteristics, and raising questions as to whether any marginal improvements justify the significant increase in computational overhead and model complexity. We call on the community to (I) retire or substantially augment current benchmarks with datasets exhibiting a wider spectrum of non-stationarities,…
Taxonomy
Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques · Forecasting Techniques and Applications
