Understanding the Implicit Biases of Design Choices for Time Series Foundation Models
Annan Yu, Danielle C. Maddix, Boran Han, Xiyuan Zhang, Abdul Fatir Ansari, Oleksandr Shchur, Christos Faloutsos, Andrew Gordon Wilson, Michael W. Mahoney, Yuyang Wang

TL;DR
This paper investigates how various design choices in time series foundation models influence their implicit biases and behavior, combining theory and experiments to understand their effects on model properties and data interactions.
Contribution
It provides a systematic analysis of how training design choices induce implicit biases in TSFMs, offering insights into their effects on model behavior and data interaction.
Findings
Design choices affect temporal and geometric properties of models.
Implicit biases can be intuitive or counterintuitive.
Multiple biases interact in complex ways when the model handles outliers.
Abstract
Time series foundation models (TSFMs) are a class of potentially powerful, general-purpose tools for time series forecasting and related temporal tasks, but their behavior is strongly shaped by subtle inductive biases in their design. Rather than developing a new model and claiming that it is better than existing TSFMs, e.g., by winning on existing well-established benchmarks, our objective is to understand how the various "knobs" of the training process affect model quality. Using a mix of theory and controlled empirical evaluation, we identify several design choices (patch size, embedding choice, training objective, etc.) and show how they lead to implicit biases in fundamental model properties (temporal behavior, geometric structure, how aggressively or not the model regresses to the mean, etc.); and we show how these biases can be intuitive or very counterintuitive, depending on…
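As a generic illustration of how a training objective can pull forecasts toward the mean, the sketch below compares the best constant forecast under squared error with the best constant forecast under the pinball (quantile) loss at q = 0.5. It is not taken from the paper: the synthetic data, the grid search, and the helper names (`best_constant`, `squared_error`, `pinball`) are all assumptions made for this example. It relies only on the standard facts that squared error is minimized by the mean and the q = 0.5 pinball loss by the median.

```python
import numpy as np


def best_constant(samples, loss, grid):
    """Return the grid value with the lowest average loss over the samples."""
    avg_loss = np.array([loss(samples, c).mean() for c in grid])
    return float(grid[np.argmin(avg_loss)])


def squared_error(y, c):
    # Squared error between targets y and a constant forecast c.
    return (y - c) ** 2


def pinball(q):
    # Pinball (quantile) loss at level q, returned as a function of (y, c).
    def loss(y, c):
        diff = y - c
        return np.maximum(q * diff, (q - 1.0) * diff)
    return loss


# Hypothetical targets: mostly near 0, with occasional spikes near 10.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(0.0, 0.1, 900),
    rng.normal(10.0, 0.1, 100),
])

# Candidate constant forecasts, spaced 0.01 apart.
grid = np.linspace(-1.0, 11.0, 1201)

mse_forecast = best_constant(samples, squared_error, grid)
median_forecast = best_constant(samples, pinball(0.5), grid)

print("squared-error forecast:", mse_forecast)
print("pinball(0.5) forecast: ", median_forecast)
```

On this toy data the squared-error forecast sits near the sample mean, between the two clusters, while the pinball forecast stays with the bulk of the data near 0. Which behavior is preferable depends on the task; the sketch only shows that the choice of loss changes what a point forecast converges to.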
Peer Reviews
Decision: ICLR 2026 Poster
The paper addresses a critical need in the TSFM literature. By shifting the focus from "what is SOTA" to "why models behave the way they do," it provides lasting insights that will remain relevant even as new models emerge. This is the kind of scientific inquiry that fosters deeper understanding. The combination of theory, controlled synthetic experiments, and analysis on real data is a major strength. The use of Chronos vs. Chronos-Bolt as a primary case study is an elegant experimental design
While the focus on Chronos/Chronos-Bolt is a strength for control, it is also a potential weakness for the generality of the conclusions. The paper does include other models like TimesFM and Moirai in some experiments (which is great!), but the core narrative and many of the detailed analyses are tightly coupled to the Chronos family. It would strengthen the paper to either include more direct evidence from a wider variety of architectures or to more explicitly frame the conclusions in terms of
* The paper studies interesting and important phenomena in time series forecasting models.
* The findings are particularly relevant in the context of foundation models.
* I appreciate the focus on clarifying the impact of design choices that are often overlooked.
* The empirical analysis is interesting and well-designed.
### Main weaknesses
The writing and presentation could be improved, and several claims would benefit from additional discussion and supporting evidence.
- The introduction could do a better job summarizing the main takeaways of the paper and explaining why the discussed phenomena are particularly important in the context of foundation models (i.e., when learning a transferable model). Some sentences are difficult to contextualize without having read the full paper, for example: “that time is
* The topic and analysis are novel.
* I expect this paper to be highly significant for time-series modelling relying on deep learning in general and not only for research in time-series foundation models.
* The paper is extremely thorough.
* The structure of the paper is nice.
* I believe the conclusion to be stronger than what is stated in the paper (see question section).
While the writing is generally good, this paper has some issues that I would like to point out.
* The different biases are not defined explicitly in the text. For example, what is meant by temporal and geometric biases? Please specify. What is the frequency bias and the periodicity bias? What are the angle, distance, and norm biases? Explicitly state this, preferably right after the bold heading for each term.
* For temporal bias, frequency, periodicity and seasonality are mentioned, but while fr
Taxonomy
Topics: Forecasting Techniques and Applications · Time Series Analysis and Forecasting · Data Visualization and Analytics
