Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis

Haoming Huang; Yibo Yan; Jiahao Huo; Xin Zou; Xinfeng Li; Kun Wang; Xuming Hu

arXiv:2505.14406 · cs.CL · September 10, 2025

TL;DR

This paper introduces PhantomCircuit, a framework that analyzes knowledge overshadowing in Large Language Models by dissecting internal attention mechanisms, providing new insights into hallucinations and potential mitigation strategies.

Contribution

The paper presents a novel knowledge circuit analysis framework to understand and detect knowledge overshadowing in LLMs, advancing beyond inference-time observations.

Findings

01. Effective identification of overshadowing instances
02. Insights into attention pattern dynamics during training
03. Potential pathways for hallucination mitigation

Abstract

Large Language Models (LLMs), despite their remarkable capabilities, are hampered by hallucinations. A particularly challenging variant, knowledge overshadowing, occurs when one piece of activated knowledge inadvertently masks another relevant piece, leading to erroneous outputs even with high-quality training data. Current understanding of overshadowing is largely confined to inference-time observations, lacking deep insights into its origins and internal mechanisms during model training. Therefore, we introduce PhantomCircuit, a novel framework designed to comprehensively analyze and detect knowledge overshadowing. By innovatively employing knowledge circuit analysis, PhantomCircuit dissects the function of key components in the circuit and how the attention pattern dynamics contribute to the overshadowing phenomenon and its evolution throughout the training process. Extensive…
