Tracing Facts or just Copies? A critical investigation of the Competitions of Mechanisms in Large Language Models

Dante Campregher; Yanxu Chen; Sander Hoffman; Maria Heuss

arXiv:2507.11809·cs.CL·July 17, 2025

Open Access

TL;DR

This reproducibility study examines how large language models handle conflicts between learned facts and contradictory context. It finds that attention heads promoting factual output do so through general copy suppression rather than selective counterfactual suppression, and that their behavior is domain-dependent, challenging prior assumptions.

Contribution

The paper provides a mechanistic analysis of attention heads in LLMs, clarifying their role in the competition between factual and counterfactual outputs and their domain specificity, and attempts to reconcile conflicting findings from prior work.

Findings

01. Attention heads promote facts via general copy suppression.

02. Attention head behavior varies across domains.

03. Larger models show more specialized attention patterns.

Abstract

This paper presents a reproducibility study examining how Large Language Models (LLMs) manage competing factual and counterfactual information, focusing on the role of attention heads in this process. We attempt to reproduce and reconcile findings from three recent studies (Ortu et al.; Yu, Merullo, and Pavlick; and McDougall et al.) that use Mechanistic Interpretability tools to investigate the competition between model-learned facts and contradictory context information. Our study specifically examines the relationship between attention head strength and factual output ratios, evaluates competing hypotheses about attention heads' suppression mechanisms, and investigates the domain specificity of these attention patterns. Our findings suggest that attention heads promoting factual output do so via general copy suppression rather than selective counterfactual suppression, as…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling