A Test of Lookahead Bias in LLM Forecasts

Zhenyu Gao; Wenxi Jiang; Yutong Yan

arXiv:2512.23847 · q-fin.GN · January 1, 2026

TL;DR

This paper introduces a statistical test for lookahead bias in LLM-generated economic forecasts: it estimates the likelihood that each prompt appeared in the model's training data and checks whether that likelihood correlates positively with forecast accuracy.

Contribution

The paper develops a diagnostic based on Lookahead Propensity (LAP), the estimated likelihood that a prompt appeared in an LLM's training corpus, to detect and measure lookahead bias in LLM forecasts.

Findings

01

A positive correlation between LAP and forecast accuracy indicates the presence and magnitude of lookahead bias

02

The test is applied to two forecasting tasks: news headlines predicting stock returns and earnings call transcripts predicting capital expenditures

03

The test provides a cost-efficient diagnostic for assessing the validity and reliability of LLM-generated forecasts

Abstract

We develop a statistical test to detect lookahead bias in economic forecasts generated by large language models (LLMs). Using state-of-the-art pre-training data detection techniques, we estimate the likelihood that a given prompt appeared in an LLM's training corpus, a statistic we term Lookahead Propensity (LAP). We formally show that a positive correlation between LAP and forecast accuracy indicates the presence and magnitude of lookahead bias, and apply the test to two forecasting tasks: news headlines predicting stock returns and earnings call transcripts predicting capital expenditures. Our test provides a cost-efficient, diagnostic tool for assessing the validity and reliability of LLM-generated forecasts.
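
The core idea in the abstract, checking whether forecasts are more accurate on prompts the model is more likely to have seen, can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes LAP scores are already available as numbers, defines accuracy as the negative absolute forecast error, and uses a one-sided permutation test on the Pearson correlation. The function names (`pearson`, `lap_accuracy_test`) and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def lap_accuracy_test(lap, forecasts, outcomes, n_perm=2000, seed=0):
    """Correlate LAP with forecast accuracy and return (rho, p_value).

    Assumption: accuracy is measured as the negative absolute error.
    The p-value is one-sided (H1: rho > 0) and comes from shuffling
    the accuracy values while holding the LAP scores fixed.
    """
    accuracy = [-abs(f - o) for f, o in zip(forecasts, outcomes)]
    observed = pearson(lap, accuracy)

    rng = random.Random(seed)
    shuffled = list(accuracy)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(lap, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic illustration only: forecast errors shrink as LAP rises,
# which is the pattern the test is designed to flag.
demo_rng = random.Random(42)
demo_lap = [demo_rng.random() for _ in range(200)]
demo_outcomes = [demo_rng.gauss(0.0, 1.0) for _ in range(200)]
demo_forecasts = [
    o + demo_rng.gauss(0.0, 1.0) * (1.0 - p)
    for o, p in zip(demo_outcomes, demo_lap)
]
rho, p_value = lap_accuracy_test(demo_lap, demo_forecasts, demo_outcomes)
print(f"rho = {rho:.3f}, one-sided p = {p_value:.4f}")
```

A significantly positive `rho` in this sketch would be read as a warning sign that accuracy is partly driven by memorized outcomes; how the paper maps the correlation to the magnitude of the bias is not reproduced here.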

Taxonomy

Topics: Stock Market Forecasting Methods · Auditing, Earnings Management, Governance · Financial Markets and Investment Strategies