Forecasting Scientific Progress with Artificial Intelligence
Sean Wu, Pan Lu, Yupeng Chen, Jonathan Bragg, Yutaro Yamada, Peter Clark, David Clifton, Philip Torr, James Zou, Junchi Yu

TL;DR
This paper introduces a new benchmark and evaluation framework to assess AI's ability to forecast scientific progress, revealing current models' limitations in predicting the timing and realization of scientific advances across disciplines.
Contribution
The paper presents CUSP, a comprehensive benchmark for scientific forecasting, and provides an empirical analysis of AI models' capabilities and shortcomings in predicting scientific progress.
Findings
AI models can identify plausible research directions but struggle to predict whether and when advances will occur.
Performance varies across scientific domains: timing predictions are more accurate for advances in AI research than for those in biology, chemistry, and physics.
Models exhibit overconfidence and response biases, indicating unreliable uncertainty estimates.
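As an illustration of what "unreliable uncertainty estimates" can mean in practice, the sketch below computes two common calibration summaries from a model's stated confidence and whether each answer turned out correct. This is a minimal sketch under assumptions: the summary does not say which calibration metrics the paper uses, and the function names `expected_calibration_error` and `overconfidence_gap`, the equal-width binning, and the toy inputs are all hypothetical.

```python
from typing import Sequence


def overconfidence_gap(confidences: Sequence[float], correct: Sequence[int]) -> float:
    """Mean stated confidence minus observed accuracy (positive means overconfident)."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    return sum(confidences) / n - sum(correct) / n


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[int], n_bins: int = 10
) -> float:
    """Expected calibration error with equal-width confidence bins (an assumed choice)."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's confidence-accuracy gap by its share of the samples.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Toy data, not from the paper: high confidence, half of the answers correct.
    toy_conf = [0.9, 0.9, 0.9, 0.9]
    toy_correct = [1, 0, 1, 0]
    print(overconfidence_gap(toy_conf, toy_correct))
    print(expected_calibration_error(toy_conf, toy_correct))
```

A positive gap together with a large calibration error would correspond to the overconfidence pattern described above; values near zero would indicate that stated confidence tracks actual accuracy.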
Abstract
Artificial intelligence (AI) is increasingly embedded in scientific discovery, yet whether it can anticipate scientific progress remains unclear. To study this question, we introduce a temporally grounded evaluation framework for forecasting scientific progress under controlled knowledge constraints. We present CUSP (Cutoff-conditioned Unseen Scientific Progress), a multi-disciplinary and event-level benchmark that evaluates scientific forecasting in AI systems through feasibility assessment, mechanistic reasoning, generative solution design, and temporal prediction. Across 4,760 scientific events, we observe systematic and domain-dependent limitations in current frontier models. While models can identify plausible research directions from competing candidates, they fail to reliably predict whether scientific advances will be realized and systematically misestimate when they will occur.…
