Factual Knowledge in Language Models: Robustness and Anomalies under Simple Temporal Context Variations
Hichem Ammar Khodja, Fr\'ed\'eric B\'echet, Quentin Brabant, Alexis Nasr, Gw\'enol\'e Lecorv\'e

TL;DR
This paper investigates how well language models understand and differentiate factual information across different temporal contexts, revealing significant limitations in their ability to accurately associate facts with correct time periods.
Contribution
It introduces the TimeStress dataset for evaluating temporal robustness in LMs and provides a comprehensive analysis of their limitations in temporal factual reasoning.
Findings
Best LM achieves perfect distinction for only 11% of facts
Current LMs struggle with temporal context differentiation
Errors made by LMs are rare but critical
Abstract
This paper explores the robustness of language models (LMs) to variations in the temporal context within factual knowledge. It examines whether LMs can correctly associate a temporal context with a past fact valid over a defined period, by asking them to differentiate correct from incorrect contexts. The LMs' ability to distinguish is analyzed along two dimensions: the distance of the incorrect context from the validity period and the granularity of the context. To this end, a dataset called TimeStress is introduced, enabling the evaluation of 18 diverse LMs. Results reveal that the best LM achieves a perfect distinction for only 11% of the studied facts, with errors, certainly rare, but critical that humans would not make. This work highlights the limitations of current LMs in temporal representation.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
