Temporal Fact Conflicts in LLMs: Reproducibility Insights from Unifying DYNAMICQA and MULAN

Ritajit Dey; Iadh Ounis; Graham McDonald; Yashar Moshfeghi

arXiv:2603.15892 · cs.IR · March 18, 2026

Open Access

TL;DR

This reproducibility study compares DYNAMICQA and MULAN, two benchmarks that reach opposite conclusions on whether external context can resolve temporal fact conflicts in LLMs. It finds that the results depend on the dataset, on model size, and on evaluation design.

Contribution

The paper reproduces the experiments of DYNAMICQA and MULAN, then runs each study's experiments on the other's dataset to analyse how dataset design and model size affect temporal fact updating in LLMs.

Findings

01 MULAN's findings generalize across frameworks

02 Dataset design significantly impacts results

03 Model size affects temporal fact encoding and updating

Abstract

Large Language Models (LLMs) often struggle with temporal fact conflicts due to outdated or evolving information in their training data. Two recent studies with accompanying datasets report opposite conclusions on whether external context can effectively resolve such conflicts. DYNAMICQA evaluates how effective external context is in shifting the model's output distribution, finding that temporal facts are more resistant to change. In contrast, MULAN examines how often external context changes memorised facts, concluding that temporal facts are easier to update. In this reproducibility paper, we first reproduce experiments from both benchmarks. We then reproduce the experiments of each study on the dataset of the other to investigate the source of their disagreement. To enable direct comparison of findings, we standardise both datasets to align with the evaluation settings of each…
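
The abstract contrasts two evaluation styles: measuring how far external context shifts a model's output distribution, and counting how often context changes a memorised fact. The sketch below illustrates that difference on toy data only. It is not the metric used by either DYNAMICQA or MULAN: the function names `distribution_shift` and `update_rate`, the use of total variation distance, and the record layout are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def distribution_shift(p_without: Dict[str, float], p_with: Dict[str, float]) -> float:
    """Total variation distance between two answer distributions.

    Assumption: a stand-in for "how much context shifts the output
    distribution"; the actual benchmark may use a different measure.
    """
    answers = set(p_without) | set(p_with)
    return 0.5 * sum(
        abs(p_without.get(a, 0.0) - p_with.get(a, 0.0)) for a in answers
    )


def update_rate(records: List[Tuple[str, str, str]]) -> float:
    """Fraction of conflicting facts whose answer follows the context.

    Each record is (answer_without_context, answer_with_context,
    context_answer). Only records where the memorised answer disagrees
    with the context count as conflicts. Hypothetical layout.
    """
    conflicts = [r for r in records if r[0] != r[2]]
    if not conflicts:
        return 0.0
    updated = sum(1 for r in conflicts if r[1] == r[2])
    return updated / len(conflicts)


if __name__ == "__main__":
    # Toy answer distributions for one question, before and after context.
    before = {"A": 0.9, "B": 0.1}
    after = {"A": 0.2, "B": 0.8}
    print("shift:", distribution_shift(before, after))

    # Toy per-fact outcomes: the third record is not a conflict.
    toy = [("A", "B", "B"), ("A", "A", "B"), ("B", "B", "B")]
    print("update rate:", update_rate(toy))
```

The two quantities can diverge: a model may move substantial probability mass toward the context answer without its top answer flipping, so a distribution-based measure and a count-based measure need not rank fact types the same way.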

Taxonomy

Topics: Topic Modeling · Language and cultural evolution · Computational and Text Analysis Methods