LLMLagBench: Identifying Temporal Training Boundaries in Large Language Models

Piotr Pęzik; Konrad Kaczyński; Maria Szymańska; Filip Żarnecki; Zuzanna Deckert; Jakub Kwiatkowski; Wojciech Janowski

arXiv:2511.12116 · cs.CL · November 18, 2025

Open Access

TL;DR

LLMLagBench is a benchmark that identifies the temporal knowledge boundaries of large language models by testing their awareness of recent events, showing how fresh a model's knowledge is and where it stops.

Contribution

The paper introduces LLMLagBench, a systematic benchmark for detecting the earliest probable temporal boundaries of an LLM's training data, and applies it to a large set of models with both declared and undeclared training cutoffs.
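As an illustration of what detecting a temporal boundary can look like, the sketch below takes dated question results, computes accuracy per month, and picks the split that best separates a high-accuracy period from a low-accuracy one. This is not the paper's procedure, since the summary here does not specify the detection algorithm. The function name `estimate_cutoff`, the input format, and the least-squares two-segment split are all assumptions made for this example.

```python
from datetime import date


def estimate_cutoff(results):
    """Hypothetical helper, not part of LLMLagBench.

    results: iterable of (event_date, is_correct) pairs, one per question.
    Returns (year, month) of the first month in the low-accuracy segment,
    or None when no drop in accuracy is found.
    """
    # Aggregate hits and totals per calendar month.
    buckets = {}
    for event_date, is_correct in results:
        key = (event_date.year, event_date.month)
        hits, total = buckets.get(key, (0, 0))
        buckets[key] = (hits + int(is_correct), total + 1)

    months = sorted(buckets)
    accuracy = [buckets[m][0] / buckets[m][1] for m in months]
    if len(months) < 2:
        return None

    def sse(values):
        # Sum of squared deviations from the segment mean.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    # Try every split point and keep the one with the lowest total error.
    best_split, best_cost = None, float("inf")
    for i in range(1, len(accuracy)):
        cost = sse(accuracy[:i]) + sse(accuracy[i:])
        if cost < best_cost:
            best_split, best_cost = i, cost

    # Report a boundary only if accuracy is lower after the split.
    before = sum(accuracy[:best_split]) / best_split
    after = sum(accuracy[best_split:]) / (len(accuracy) - best_split)
    if after >= before:
        return None
    return months[best_split]


# Toy data (made up): correct answers through March 2024, wrong afterwards.
toy_results = [(date(2024, m, 15), m <= 3) for m in range(1, 7)]
toy_cutoff = estimate_cutoff(toy_results)
```

A two-segment split is the simplest possible changepoint model. Real accuracy curves tend to decay gradually near a cutoff, so a practical detector would likely need smoothing or a more robust changepoint method.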

Findings

01. LLMLagBench effectively identifies LLM knowledge cutoffs.

02. Models with declared cutoffs show more accurate boundary detection.

03. Benchmark results reveal varying knowledge freshness across models.

Abstract

Large Language Models (LLMs) are pretrained on textual data up to a specific temporal cutoff. This creates a strict knowledge boundary beyond which models cannot provide accurate information without querying external sources. More subtly, when this limitation is unknown or ignored, LLMs may inadvertently blend outdated time-sensitive information with general knowledge during reasoning tasks, potentially compromising response accuracy. We introduce LLMLagBench, an LLM freshness benchmark, as a systematic approach for identifying the earliest probable temporal boundaries of an LLM's training data by evaluating its knowledge of recent events. We then apply this benchmark to evaluate a large set of LLMs, including models with both explicitly declared and undeclared training cutoffs. The reliability of the benchmark is assessed by manual validation and comparison with publicly released…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Artificial Intelligence in Healthcare and Education