Simulated Ignorance Fails: A Systematic Study of LLM Behaviors on Forecasting Problems Before Model Knowledge Cutoff

Zehan Li; Yuxuan Wang; Ali El Lahib; Ying-Jieh Xia; Xinyu Pi

arXiv:2601.13717·cs.CL·January 21, 2026

Open Access

TL;DR

This study systematically tests whether simulated ignorance prompts, which instruct a large language model to disregard knowledge it already has, can approximate genuine ignorance. It finds that they cannot, which calls into question the validity of retrospective forecasting benchmarks that rely on such prompts.

Contribution

The first systematic test of whether simulated ignorance (SI) prompts approximate true ignorance (TI) in large language models. It shows that they do not do so reliably and highlights methodological issues in retrospective forecasting evaluations.

Findings

01 Simulated Ignorance (SI) leaves a 52% performance gap relative to True Ignorance (TI)

02 Chain-of-thought reasoning does not suppress prior knowledge, even when reasoning traces contain no explicit post-cutoff references

03 Reasoning-optimized models exhibit worse SI fidelity

Abstract

Evaluating LLM forecasting capabilities is constrained by a fundamental tension: prospective evaluation offers methodological rigor but prohibitive latency, while retrospective forecasting (RF) -- evaluating on already-resolved events -- faces rapidly shrinking clean evaluation data as SOTA models possess increasingly recent knowledge cutoffs. Simulated Ignorance (SI), prompting models to suppress pre-cutoff knowledge, has emerged as a potential solution. We provide the first systematic test of whether SI can approximate True Ignorance (TI). Across 477 competition-level questions and 9 models, we find that SI fails systematically: (1) cutoff instructions leave a 52% performance gap between SI and TI; (2) chain-of-thought reasoning fails to suppress prior knowledge, even when reasoning traces contain no explicit post-cutoff references; (3) reasoning-optimized models exhibit worse SI…
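The abstract describes SI as prompting a model to suppress knowledge it holds about events past a chosen cutoff. As a rough illustration only, the sketch below wraps a forecasting question in such an instruction. The function name `build_si_prompt`, the instruction wording, and the example question are all hypothetical; they are not the prompt or code used in the paper.

```python
from datetime import date


def build_si_prompt(question: str, simulated_cutoff: date) -> str:
    """Wrap a forecasting question in a simulated-ignorance instruction.

    Hypothetical wording for illustration; the paper's actual prompt
    is not shown in this summary.
    """
    cutoff = simulated_cutoff.isoformat()
    return (
        f"Assume today's date is {cutoff}. "
        f"Do not use any information about events after {cutoff}.\n"
        f"Question: {question}\n"
        "Answer with a probability between 0 and 1."
    )


# Placeholder question, not taken from the paper's dataset.
prompt = build_si_prompt("Will event X occur by 2025-06-30?", date(2025, 1, 1))
print(prompt)
```

Under the paper's findings, an instruction of this kind should not be expected to make a model behave as if it truly lacked the later knowledge; the sketch only shows what the SI setup looks like at the prompt level.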

Taxonomy

Topics: Forecasting Techniques and Applications · Explainable Artificial Intelligence (XAI) · Topic Modeling