Implicit Reasoning in Deep Time Series Forecasting
Willa Potosnak, Cristian Challu, Mononito Goswami, Michał Wiliński, Nina Żukowska, Artur Dubrawski

TL;DR
This paper investigates whether deep time series forecasting models demonstrate genuine reasoning abilities or rely on memorization, finding evidence of some reasoning capabilities in certain models through out-of-distribution generalization tests.
Contribution
It introduces an initial assessment of reasoning in deep time series models, highlighting their potential to generalize beyond memorization in OOD scenarios.
Findings
Certain models generalize effectively in OOD scenarios
Evidence of reasoning capabilities beyond memorization
Certain linear, MLP-based, and patch-based Transformer models show promising generalization
Abstract
Recently, time series foundation models have shown promising zero-shot forecasting performance on time series from a wide range of domains. However, it remains unclear whether their success stems from a true understanding of temporal dynamics or simply from memorizing the training data. While implicit reasoning in language models has been studied, similar evaluations for time series models have been largely unexplored. This work takes an initial step toward assessing the reasoning abilities of deep time series forecasting models. We find that certain linear, MLP-based, and patch-based Transformer models generalize effectively in systematically orchestrated out-of-distribution scenarios, suggesting underexplored reasoning capabilities beyond simple pattern memorization.
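To make the idea of an out-of-distribution generalization test concrete, here is a minimal sketch. It is not the paper's protocol: the sinusoidal data, the choice of periods, the lag length, and the helper name make_windows are all assumptions chosen for illustration. A linear autoregressive forecaster is fit by least squares on sinusoids with a few training periods, then scored on fresh series with the same periods (in-distribution) and on a held-out period (out-of-distribution).

```python
import numpy as np

def make_windows(periods, lags, n_series, length, rng):
    """Build (lag window, next value) pairs from noise-free sinusoids.

    Hypothetical helper for illustration; not taken from the paper.
    """
    X, y = [], []
    t = np.arange(length)
    for period in periods:
        for _ in range(n_series):
            phase = rng.uniform(0.0, 2.0 * np.pi)
            series = np.sin(2.0 * np.pi * t / period + phase)
            for start in range(length - lags):
                X.append(series[start:start + lags])
                y.append(series[start + lags])
    return np.asarray(X), np.asarray(y)

rng = np.random.default_rng(0)
LAGS = 24                      # assumed context length
TRAIN_PERIODS = [8, 12, 16]    # assumed training distribution
OOD_PERIODS = [10]             # assumed held-out period, unseen in training

X_train, y_train = make_windows(TRAIN_PERIODS, LAGS, 5, 96, rng)
X_id, y_id = make_windows(TRAIN_PERIODS, LAGS, 5, 96, rng)    # new phases, seen periods
X_ood, y_ood = make_windows(OOD_PERIODS, LAGS, 5, 96, rng)    # unseen period

# Linear one-step-ahead forecaster: minimum-norm least-squares weights.
weights, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

id_mae = float(np.mean(np.abs(X_id @ weights - y_id)))
ood_mae = float(np.mean(np.abs(X_ood @ weights - y_ood)))
print(f"in-distribution MAE:     {id_mae:.3e}")
print(f"out-of-distribution MAE: {ood_mae:.3e}")
```

The gap between the two errors is the quantity of interest: a model that has only memorized the training periods fits them almost exactly but degrades on the held-out period, whereas a model that captures the underlying structure would keep the gap small.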
Taxonomy
Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods
Methods: Linear Layer · Multi-Head Attention · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Attention Is All You Need · Position-Wise Feed-Forward Layer · Dropout
