That's Deprecated! Understanding, Detecting, and Steering Knowledge Conflicts in Language Models for Code Generation

Jaesung Bae; Cameron Churchwell; Mitchell Hermon; Tsun-An Hsieh; Jocelyn Xu; Yekaterina Yegorova; Mark Hasegawa-Johnson; Heng Ji

arXiv:2510.19116·cs.CL·October 23, 2025

TL;DR

This paper examines how large language models behave when a code-generation prompt conflicts with their parametric knowledge. It proposes a framework for constructing and interpreting such conflicts, and reports detection accuracy of up to 80.65% and up to a 12.6% improvement in steering success over a random baseline.

Contribution

The paper introduces a domain-agnostic framework for constructing and interpreting knowledge conflicts in LLMs, together with a novel evaluation method and dataset tailored to code conflict scenarios.

Findings

01

Sufficiently large LLMs encode the notion of a knowledge conflict in their parameters.

02

Detection accuracy reaches up to 80.65%.

03

Activation-level steering achieves up to a 12.6% improvement in steering success over a random baseline.

Abstract

This paper investigates how large language models (LLMs) behave when faced with discrepancies between their parametric knowledge and conflicting information contained in a prompt. Building on prior question-answering (QA) research, we extend the investigation of knowledge conflicts to the realm of code generation. We propose a domain-agnostic framework for constructing and interpreting such conflicts, along with a novel evaluation method and dataset tailored to code conflict scenarios. Our experiments indicate that sufficiently large LLMs encode the notion of a knowledge conflict in their parameters, enabling us to detect knowledge conflicts with up to 80.65% accuracy. Building on these insights, we show that activation-level steering can achieve up to a 12.6% improvement in steering success over a random baseline. However, effectiveness depends critically on…
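
To make the two ideas in the abstract concrete (detecting a conflict from internal activations, and steering by editing activations), here is a minimal sketch on synthetic data. It is not the authors' method: the abstract does not specify their probe or steering procedure, so this example assumes a simple difference-of-means direction, and every function name, dimension, and constant below is hypothetical.

```python
import numpy as np


def make_synthetic_activations(n_per_class=200, dim=32, shift=3.0, seed=0):
    """Generate toy 'hidden states' (assumption: no real model is involved).

    Class 0 stands in for prompts without a knowledge conflict, class 1 for
    prompts with one. Class 1 is shifted along a fixed unit direction.
    """
    rng = np.random.default_rng(seed)
    true_dir = np.ones(dim) / np.sqrt(dim)  # fixed, so seeds only change the noise
    no_conflict = rng.normal(size=(n_per_class, dim))
    conflict = rng.normal(size=(n_per_class, dim)) + shift * true_dir
    X = np.vstack([no_conflict, conflict])
    y = np.concatenate([
        np.zeros(n_per_class, dtype=int),
        np.ones(n_per_class, dtype=int),
    ])
    return X, y


def fit_mean_difference_probe(X, y):
    """Estimate a unit 'conflict direction' and a midpoint decision threshold."""
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    direction = mu1 - mu0
    direction /= np.linalg.norm(direction)
    threshold = 0.5 * (mu0 + mu1) @ direction
    return direction, threshold


def detect_conflict(X, direction, threshold):
    """Label an activation as 'conflict' (1) if its projection exceeds the threshold."""
    return (X @ direction > threshold).astype(int)


def steer(X, direction, alpha):
    """Shift activations along the estimated direction by alpha (hypothetical steering step)."""
    return X + alpha * direction


if __name__ == "__main__":
    X_train, y_train = make_synthetic_activations(seed=0)
    X_test, y_test = make_synthetic_activations(seed=1)

    direction, threshold = fit_mean_difference_probe(X_train, y_train)
    accuracy = (detect_conflict(X_test, direction, threshold) == y_test).mean()
    print(f"held-out detection accuracy on synthetic data: {accuracy:.3f}")

    # Push 'no conflict' activations toward the 'conflict' side and re-run the probe.
    steered = steer(X_test[y_test == 0], direction, alpha=6.0)
    flagged = detect_conflict(steered, direction, threshold).mean()
    print(f"fraction flagged as conflict after steering: {flagged:.3f}")
```

In a real setting the rows of `X` would be activations read from a chosen layer of a language model, and steering would modify those activations during generation. The numbers this toy prints say nothing about the accuracies reported in the paper.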

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks