Time-Aware Prior Fitted Networks for Zero-Shot Forecasting with Exogenous Variables
Andres Potapczynski, Ravi Kiran Selvam, Tatiana Konstantinova, Shankar Ramasubramanian, Malcolm Wolff, Kin G. Olivares, Ruijun Ma, Mengfei Cao, Michael W. Mahoney, Andrew Gordon Wilson, Boris N. Oreshkin, Dmitry Efimov

TL;DR
This paper introduces ApolloPFN, a novel time-aware prior-data fitted network that effectively incorporates exogenous variables for improved zero-shot time series forecasting, outperforming existing models on key benchmarks.
Contribution
The paper presents a new architecture and synthetic data generation method that enable prior-data fitted networks to leverage exogenous covariates in time series forecasting.
Findings
Achieves state-of-the-art results on the M5 and electricity price benchmarks.
Effectively incorporates exogenous variables into zero-shot forecasting.
Outperforms existing models that ignore exogenous signals.
Abstract
In many time series forecasting settings, the target time series is accompanied by exogenous covariates, such as promotions and prices in retail demand; temperature in energy load; calendar and holiday indicators for traffic or sales; and grid load or fuel costs in electricity pricing. Ignoring these exogenous signals can substantially degrade forecasting accuracy, particularly when they drive spikes, discontinuities, or regime and phase changes in the target series. Most current time series foundation models (e.g., Chronos, Sundial, TimesFM, TimeMoE, TimeLLM, and LagLlama) ignore exogenous covariates and make forecasts solely from the numerical time series history, thereby limiting their performance. In this paper, we develop ApolloPFN, a prior-data fitted network (PFN) that is time-aware (unlike prior PFNs) and that natively incorporates exogenous covariates (unlike prior univariate…
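The abstract's claim that ignoring exogenous signals degrades accuracy when they drive spikes can be illustrated with a toy experiment. The sketch below is not ApolloPFN or any PFN: it is a plain least-squares one-step-ahead forecaster on synthetic data, and the promotion flag, coefficients, and function names are all hypothetical choices made for illustration. It also assumes the covariate value for the forecast step is known in advance, as calendar or planned-promotion features would be.

```python
import numpy as np

def make_series(n=400, seed=0):
    """Synthetic AR(1) target whose spikes are driven by a hypothetical promo flag."""
    rng = np.random.default_rng(seed)
    promo = (rng.random(n) < 0.1).astype(float)  # exogenous covariate (assumed)
    noise = rng.normal(0.0, 0.5, n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.6 * y[t - 1] + 5.0 * promo[t] + noise[t]
    return y, promo

def fit_and_score(features, target, n_train):
    """Fit least squares on the first n_train rows; return MAE on the rest."""
    X = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(X[:n_train], target[:n_train], rcond=None)
    pred = X[n_train:] @ coef
    return float(np.mean(np.abs(pred - target[n_train:])))

y, promo = make_series()
lag, target, promo_now = y[:-1], y[1:], promo[1:]
n_train = 300

# Forecaster that only sees the numerical history (previous value).
mae_history_only = fit_and_score(lag[:, None], target, n_train)

# Forecaster that also sees the exogenous flag for the step being predicted.
mae_with_exog = fit_and_score(np.column_stack([lag, promo_now]), target, n_train)

print(f"history only: {mae_history_only:.3f}  with exogenous: {mae_with_exog:.3f}")
```

In this setup the history-only model cannot anticipate the promotion spikes, so its held-out error is larger than that of the model given the covariate. The gap depends entirely on the assumed spike size and says nothing about the magnitudes reported in the paper.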
Peer Reviews
Decision·Submitted to ICLR 2026
Adapting the PFN paradigm to time series with native exogenous handling and explicit time-aware inductive bias is a meaningful advance. The critique of using order-invariant tabular FMs for TS is convincing (forecasting requires order sensitivity), and the synthetic-prior design tailored to TS is a good fit for PFN training. Given the practical importance of zero-shot + exogenous, the contribution is significant. * The paper clearly articulates failure modes of TabPFN-TS (order-invariance, weak
I believe the paper offers only a limited analysis of probabilistic calibration and robustness to exogenous shift, both of which matter in real-life time-series settings. I also believe that the heavy reliance on synthetic priors deserves more discussion of how well they align with real exogenous processes. I agree with the authors that the scaling constraints from quadratic attention are a considerable weakness.
1. The paper is clear and well presented. 2. The framework performs well across the two datasets considered by the authors (the M5 dataset and electricity price forecasting). 3. The paper presents an algorithm that improves PFM architectures for exogenous features.
1. While the framework is established for multivariate datasets, the comparison on just two datasets seems limited. Other multivariate datasets, such as wind power forecasting, could be used to better understand the algorithm's performance. 2. The set of benchmark models could be improved by including TTM and/or Flowstate, which also work with exogenous features and rank in the top 10 on Gift-Eval.
1. The proposed temporal SCM generation procedure and the SNGN algorithm are well-motivated and supported by empirical evidence. The design effectively introduces temporal dependencies into the synthetic data, which allows the model to learn meaningful temporal structures during training. This approach represents a thoughtful and principled way to bridge the gap between tabular PFNs and time-series forecasting tasks.
1. Many of the “failure mode” demonstrations (e.g., Fig. 2) rely on single illustrative examples rather than aggregated or statistically supported analyses. Without quantitative evidence across a larger number of series or benchmarks, it is difficult to assess whether these issues with TabPFN-TS are systematic or merely anecdotal, which somewhat weakens the empirical foundation of the argument. 2. Most datasets containing exogenous covariates (such as M5 and electricity price forecasting) are r
Taxonomy
Topics: Traffic Prediction and Management Techniques · Forecasting Techniques and Applications · Stock Market Forecasting Methods
