Investigating Compositional Reasoning in Time Series Foundation Models
Willa Potosnak, Cristian Challu, Mononito Goswami, Kin G. Olivares, Michał Wiliński, Nina Żukowska, Artur Dubrawski

TL;DR
This paper investigates whether large time series foundation models succeed by reasoning about data patterns or by memorizing them. It defines compositional reasoning for forecasting, evaluates models' reasoning abilities, and analyzes how architectural choices affect generalization and reasoning performance.
Contribution
The paper formally defines compositional reasoning for time series forecasting, evaluates the reasoning capabilities of 16 deep learning forecasting models, and identifies architectural factors that influence reasoning and generalization.
Findings
Patch-based Transformers excel in reasoning performance.
Residualized MLP architectures are nearly as good but more efficient.
Some models outperform traditional statistical baselines in out-of-distribution scenarios.
Abstract
Large pre-trained time series foundation models (TSFMs) have demonstrated promising zero-shot performance across a wide range of domains. However, a question remains: Do TSFMs succeed by memorizing patterns in training data, or do they possess the ability to reason about such patterns? While reasoning is a topic of great interest in the study of Large Language Models (LLMs), it is undefined and largely unexplored in the context of TSFMs. In this work, inspired by language modeling literature, we formally define compositional reasoning in forecasting and distinguish it from in-distribution generalization. We evaluate the reasoning and generalization capabilities of 16 popular deep learning forecasting models on multiple synthetic and real-world datasets. Additionally, through controlled studies, we systematically examine which design choices in 7 popular open-source TSFMs contribute to…
