ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models

Yachuan Liu; Xiaochun Wei; Lin Shi; Xinnuo Li; Bohan Zhang; Paramveer Dhillon; Qiaozhu Mei

arXiv:2505.19533 · cs.LG · May 27, 2025

TL;DR

ExAnte introduces a benchmark to evaluate large language models' ability to perform reasoning and predictions within specified temporal constraints, revealing current models' struggles with ex-ante reasoning across various tasks.

Contribution

The paper presents a new benchmark and task suite specifically designed to assess LLMs' ex-ante reasoning capabilities under temporal constraints, highlighting existing limitations.

Findings

01. LLMs often rely on future information beyond temporal cutoffs.
02. Models struggle to adhere to temporal constraints with common prompting methods.
03. The benchmark offers a framework for improving time-sensitive reasoning in LLMs.

Abstract

Large language models (LLMs) face significant challenges in ex-ante reasoning, where analysis, inference, or predictions must be made without access to information from future events. Even with explicit prompts enforcing temporal cutoffs, LLMs often generate outputs influenced by internalized knowledge of events beyond the specified cutoff. This paper introduces a novel task and benchmark designed to evaluate the ability of LLMs to reason while adhering to such temporal constraints. The benchmark includes a variety of tasks: stock prediction, Wikipedia event prediction, scientific publication prediction, and Question Answering (QA), designed to assess factual knowledge under temporal cutoff constraints. We use leakage rate to quantify models' reliance on future information beyond cutoff timestamps. Experimental results reveal that LLMs struggle to consistently adhere to temporal cutoffs…
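The abstract names leakage rate as the measure of reliance on post-cutoff information, but this page does not give its exact definition. As a minimal sketch, assuming leakage rate is the fraction of responses that refer to at least one event dated after the cutoff, it could be computed as follows. The `Response` schema, the `is_leaked` rule, and the function names are hypothetical illustrations, not the paper's implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Response:
    """One model output plus the dates of the events it refers to (hypothetical schema)."""
    cutoff: date
    referenced_dates: list[date]


def is_leaked(response: Response) -> bool:
    # Assumption: a response leaks if any referenced event is dated after its cutoff.
    return any(d > response.cutoff for d in response.referenced_dates)


def leakage_rate(responses: list[Response]) -> float:
    # Fraction of leaked responses; defined here as 0.0 for an empty list.
    if not responses:
        return 0.0
    return sum(is_leaked(r) for r in responses) / len(responses)
```

In practice, deciding which dates a free-text response refers to is the hard part; this sketch assumes that step has already been done and only shows the aggregation.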

Code & Models

Repositories

yachuan/exante (Official)

Videos

ExAnte: A Benchmark for Ex-Ante Inference in Large Language Models · Underline

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis