When Facts Change: Probing LLMs on Evolving Knowledge with evolveQA

Nishanth Sridhar Nakshatri; Shamik Roy; Manoj Ghuhan Arivazhagan; Hanhan Zhou; Vinayshekhar Bannihatti Kumar; Rashmi Gangadharaiah

arXiv:2510.19172 · cs.CL · November 18, 2025

TL;DR

This paper introduces evolveQA, a benchmark for evaluating how well large language models handle changing facts over time, revealing significant performance drops on evolving knowledge questions.

Contribution

The paper presents evolveQA, a novel benchmark constructed from real-world, time-stamped data to assess LLMs' ability to adapt to evolving knowledge over time.

Findings

01. LLMs show up to 31% performance decline on evolveQA.

02. Existing models struggle with temporally evolving information.

03. evolveQA effectively highlights knowledge update limitations.

Abstract

LLMs often fail to handle temporal knowledge conflicts: contradictions arising when facts evolve over time within their training data. Existing studies evaluate this phenomenon through benchmarks built on structured knowledge bases like Wikidata, but they focus on widely-covered, easily-memorized popular entities and lack the dynamic structure needed to fairly evaluate LLMs with different knowledge cut-off dates. We introduce evolveQA, a benchmark specifically designed to evaluate LLMs on temporally evolving knowledge, constructed from 3 real-world, time-stamped corpora: AWS updates, Azure changes, and WHO disease outbreak reports. Our framework identifies naturally occurring knowledge evolution and generates questions with gold answers tailored to different LLM knowledge cut-off dates. Through extensive evaluation of 12 open and closed-source LLMs across 3 knowledge probing formats, we…
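The idea of gold answers tailored to a model's knowledge cut-off can be illustrated with a minimal sketch. This is not the authors' pipeline: the `FactVersion` type, the `gold_answer` function, and the example history are hypothetical. The sketch assumes that the gold answer for a given model is the most recent version of a fact that took effect on or before that model's cut-off date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class FactVersion:
    """One time-stamped version of an evolving fact (hypothetical schema)."""
    value: str
    effective: date


def gold_answer(history: Sequence[FactVersion], cutoff: date) -> Optional[str]:
    """Return the latest fact value in effect on or before `cutoff`.

    Returns None when no version predates the cut-off, meaning the model
    could not have seen any version of the fact during training.
    """
    known = [v for v in history if v.effective <= cutoff]
    if not known:
        return None
    return max(known, key=lambda v: v.effective).value


# Invented example data: a service limit that changed twice.
history = [
    FactVersion(value="100", effective=date(2023, 3, 1)),
    FactVersion(value="250", effective=date(2024, 6, 15)),
    FactVersion(value="500", effective=date(2025, 2, 1)),
]

# Two models with different cut-offs get different gold answers
# for the same question.
answer_for_older_model = gold_answer(history, date(2024, 1, 1))
answer_for_newer_model = gold_answer(history, date(2025, 6, 1))
```

Under this assumption, one question about the same fact yields a different gold answer for each model, which is what lets models with different cut-off dates be scored on an equal footing.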


Taxonomy

Topics: Topic Modeling · Wikis in Education and Collaboration · Advanced Graph Neural Networks