TL;DR
TempoPFN is a linear RNN pre-trained exclusively on synthetic data for zero-shot time series forecasting. It achieves competitive results efficiently without any real training data and offers a reproducible pipeline for future research.
Contribution
The paper presents TempoPFN, a novel synthetic pre-training approach for linear RNNs that enables efficient zero-shot long-horizon time series forecasting.
Findings
Outperforms existing synthetic-only models on benchmarks.
Surpasses most real-data trained models in zero-shot settings.
Offers a fully parallelizable training and inference process.
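The parallelization finding rests on a general property of linear recurrences: a step of the form h_t = a_t * h_{t-1} + b_t is an affine map, and composing affine maps is associative, so all states can be obtained with a prefix scan instead of a strictly sequential loop. The sketch below illustrates this for a scalar recurrence only. It is not the GatedDeltaProduct layer or the paper's state-weaving mechanism, and the function names are hypothetical.

```python
def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_states(a, b, h0=0.0):
    """Reference implementation: one step at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def scan_states(a, b, h0=0.0):
    """Inclusive prefix scan (Hillis-Steele style).

    Within each round, every position is updated independently of the
    others, so a round could run in parallel; about log2(n) rounds are needed.
    """
    elems = list(zip(a, b))
    n = len(elems)
    step = 1
    while step < n:
        elems = [
            elems[i] if i < step else combine(elems[i - step], elems[i])
            for i in range(n)
        ]
        step *= 2
    # elems[i] now holds the affine map composed over positions 0..i.
    return [a_c * h0 + b_c for a_c, b_c in elems]


a = [0.9, 0.5, 1.0, 0.2, 0.7]
b = [1.0, 2.0, 3.0, 4.0, 5.0]
states_loop = sequential_states(a, b, h0=0.5)
states_scan = scan_states(a, b, h0=0.5)
```

Both functions return the same states up to floating-point rounding. The scan version trades extra arithmetic for a dependency depth that grows logarithmically with sequence length.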
Abstract
Foundation models for zero-shot time series forecasting face challenges in efficient long-horizon prediction and reproducibility, with existing synthetic-only approaches underperforming on challenging benchmarks. This paper presents TempoPFN, a univariate time series foundation model based on linear Recurrent Neural Networks (RNNs) pre-trained exclusively on synthetic data. The model uses a GatedDeltaProduct architecture with state-weaving for fully parallelizable training across sequence lengths, eliminating the need for windowing or summarization techniques while maintaining robust temporal state-tracking. Our comprehensive synthetic data pipeline unifies diverse generators, including stochastic differential equations, Gaussian processes, and audio synthesis, with novel augmentations. In zero-shot evaluations on the Gift-Eval, fev-bench and Chronos-ZS benchmarks, TempoPFN achieves…
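To make the idea of a synthetic generator concrete, here is a minimal sketch of one of the generator families the abstract names, a stochastic differential equation. It simulates an Ornstein-Uhlenbeck process with the Euler-Maruyama scheme and adds a sinusoidal seasonal component. The choice of process, the function names, and all parameter values are assumptions made for illustration and are not taken from the TempoPFN pipeline.

```python
import math
import random


def simulate_ou(n_steps, theta=0.5, mu=0.0, sigma=0.3, dt=0.1, x0=0.0, seed=0):
    """Euler-Maruyama simulation of dX = theta * (mu - X) dt + sigma dW."""
    rng = random.Random(seed)  # local RNG so results are reproducible
    x = x0
    path = [x]
    for _ in range(n_steps - 1):
        noise = rng.gauss(0.0, 1.0)
        # Drift pulls x toward mu; diffusion scales with sqrt(dt).
        x = x + theta * (mu - x) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path


def add_seasonality(series, period=24, amplitude=1.0):
    """Add a sinusoidal seasonal component (an illustrative augmentation)."""
    return [
        value + amplitude * math.sin(2.0 * math.pi * t / period)
        for t, value in enumerate(series)
    ]


# One synthetic univariate series of length 200.
series = add_seasonality(simulate_ou(200, seed=42))
```

A pre-training corpus in this spirit would draw many such series with randomized parameters. How the paper samples, mixes, and augments its generators is not described in the text above.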
Peer Reviews
Decision·Submitted to ICLR 2026
1. The presentation of the paper is good.
2. The synthetic data generation pipeline is quite exhaustive and covers many types of synthetic data.
3. The author(s) promise to open-source the code and pipelines.
However, the paper has a few weaknesses.
1. The novelty of the paper is limited. First, linear RNNs are not new; they are adopted from the literature. Training models purely on synthetic data is not new either; the paper just creates a more diverse set of synthetic data. Given the idea of TabPFN or ForecastPFN, the proposed work can be tried quite trivially, without many complications.
2. The performance of the model is not up to the mark and is only marginally better than TabPFN-TS in CRPS and (from 0.544 to
- Unlike the currently dominant Transformer architecture, this paper employs a linear RNN as its framework, achieving performance comparable to models with hundreds of millions of parameters while maintaining under 35M parameters itself.
- The paper introduces a comprehensive approach to data synthesis and augmentation, enabling the complete elimination of real-world data for model training. Combined with the compact model architecture, this significantly lowers the barrier for training and depl
- The paper employs GatedDeltaProduct, which is a relatively novel network architecture. However, the paper lacks a clear explanation of this architecture and justification for its selection.
- The authors mention the concepts of "state tracking" and "state weaving". However, the authors neither define these terms nor clarify their relevance to time series forecasting tasks. This reflects a broader trend in the field, where methodologies from LLM and computer vision are frequently adopted witho
The authors' model is trained exclusively on synthetic data, which is advantageous for zero-shot generalization on various benchmark tasks. The authors also open-source their data processing and generation pipeline, which is much appreciated, as it allows others to quickly build on top of this work and test out new research ideas.
Unclear description of the core architecture of the paper: Unfortunately, the authors describe their architecture in little detail, in roughly half a page (including a figure). Instead, the authors refer the reader to the DeltaProduct paper, which forms the core of this architecture. Since the authors present this as a novel model architecture, it would be good to at least have a better description of this critical building block. There appears to be more text on page 3 introducing and d
