Frozen Forecasting: A Unified Evaluation

Jacob C Walker; Pedro Vélez; Luisa Polania Cabrera; Guangyao Zhou; Sayna Ebrahimi; Rishabh Kabra; Carl Doersch; Maks Ovsjanikov; João Carreira; Shiry Ginosar

arXiv:2507.13942 · cs.CV · April 16, 2026

TL;DR

This paper introduces a unified framework for evaluating the forecasting capabilities of frozen vision models across diverse tasks and abstraction levels, using trajectory-based and distributional metrics.

Contribution

The paper proposes an evaluation method that forecasts entire trajectories in a frozen model's feature space and decodes them with lightweight, task-specific readouts, enabling consistent comparison across vision models and tasks.

Findings

01. Forecasting performance correlates with perceptual quality.

02. Video synthesis models outperform image-based models in forecasting.

03. Language supervision does not consistently enhance forecasting abilities.

Abstract

Forecasting future events is a fundamental capability for general-purpose systems that plan or act across different levels of abstraction. Yet, evaluating whether a forecast is "correct" remains challenging due to the inherent uncertainty of the future. We propose a unified evaluation framework for assessing the forecasting capabilities of frozen vision backbones across diverse tasks and abstraction levels. Rather than focusing on single time steps, our framework evaluates entire trajectories and incorporates distributional metrics that better capture the multimodal nature of future outcomes. Given a frozen vision model, we train latent diffusion models to forecast future features directly in its representation space, which are then decoded via lightweight, task-specific readouts. This enables consistent evaluation across a suite of diverse tasks while isolating the forecasting capacity…
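The pipeline described in the abstract (frozen backbone, stochastic forecaster in feature space, lightweight readout, trajectory-level scoring) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random projection standing in for the backbone, the noise-based `sample_futures` standing in for the latent diffusion model, the linear readout, all sizes, and the choice of mean-of-K, best-of-K, and sample spread as metrics are hypothetical. The abstract does not specify which distributional metrics the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: context steps, future steps, input dim,
# feature dim, readout dim, and number of sampled futures.
T_CTX, T_FUT, D_IN, D_FEAT, D_OUT, K = 4, 6, 32, 16, 2, 8

# Stand-in for a frozen backbone: a fixed projection that is never updated.
W_frozen = rng.normal(size=(D_IN, D_FEAT))


def encode(frames):
    """Map frames of shape (T, D_IN) to frozen features of shape (T, D_FEAT)."""
    return np.tanh(frames @ W_frozen)


def sample_futures(ctx_feats, k):
    """Placeholder for a latent diffusion forecaster (assumption).

    Draws k stochastic feature trajectories of shape (k, T_FUT, D_FEAT)
    as random walks starting from the last context feature.
    """
    noise = rng.normal(scale=0.1, size=(k, T_FUT, D_FEAT))
    return ctx_feats[-1][None, None, :] + np.cumsum(noise, axis=1)


# Stand-in for a lightweight task-specific readout: a single linear map.
W_readout = rng.normal(size=(D_FEAT, D_OUT))


def readout(feats):
    """Decode features (..., D_FEAT) into task outputs (..., D_OUT)."""
    return feats @ W_readout


def trajectory_metrics(pred, target):
    """Score K sampled trajectories against one ground-truth trajectory.

    pred has shape (K, T, D_OUT) and target has shape (T, D_OUT).
    Errors are averaged over the whole trajectory, not a single step.
    """
    err = np.linalg.norm(pred - target[None], axis=-1).mean(axis=1)  # (K,)
    return {
        "mean_of_k": float(err.mean()),           # average sample quality
        "best_of_k": float(err.min()),            # closest sampled future
        "spread": float(pred.std(axis=0).mean()), # diversity across samples
    }


# Synthetic "video": T_CTX observed steps followed by T_FUT future steps.
frames = rng.normal(size=(T_CTX + T_FUT, D_IN))
feats = encode(frames)

futures = sample_futures(feats[:T_CTX], K)  # forecast in feature space
pred = readout(futures)                     # decode sampled futures
target = readout(feats[T_CTX:])             # decode ground-truth features
metrics = trajectory_metrics(pred, target)
print(metrics)
```

Reporting a best-of-K error alongside the mean and the spread is one common way to avoid penalizing a forecaster for sampling plausible futures that differ from the single observed one, which is the concern the abstract raises about multimodal outcomes.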
