Reforming the Mechanism: Editing Reasoning Patterns in LLMs with Circuit Reshaping
Zhenyu Lei, Qiong Wu, Jianxiong Dong, Yinhan He, Emily Dodwell, Yushun Dong, Jundong Li

TL;DR
This paper introduces REdit, a novel framework for selectively editing reasoning patterns in large language models by reshaping neural circuits, improving targeted reasoning abilities while preserving other skills.
Contribution
It proposes the first method to actively reshape neural circuits to control reasoning pattern interference, balancing generality and locality in reasoning edits.
Findings
REdit outperforms baselines in reasoning generality and locality.
Reshaping neural circuits reduces interference between reasoning patterns.
The method is effective on propositional logic and mathematics tasks.
Abstract
Large language models (LLMs) often exhibit flawed reasoning ability that undermines reliability. Existing approaches to improving reasoning typically treat it as a general and monolithic skill, applying broad training which is inefficient and unable to target specific reasoning errors. We introduce Reasoning Editing, a paradigm for selectively modifying specific reasoning patterns in LLMs while preserving other reasoning pathways. This task presents a fundamental trade-off between Generality, the ability of an edit to generalize across different tasks sharing the same reasoning pattern, and Locality, the ability to preserve other reasoning capabilities. Through systematic investigation, we uncover the Circuit-Interference Law: Edit interference between reasoning patterns is proportional to the overlap of their neural circuits. Guided by this principle, we propose REdit, the first…
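To make the quantities in the abstract concrete, here is a minimal sketch of how circuit overlap, Generality, and Locality could be measured. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how overlap is computed, so Jaccard similarity over sets of component identifiers stands in for it, and the two metrics follow the usual model-editing convention (accuracy on held-out tasks sharing the edited pattern, and the fraction of unrelated predictions left unchanged). All function names and the example component labels are hypothetical.

```python
from typing import Hashable, Iterable, Sequence


def circuit_overlap(circuit_a: Iterable[Hashable], circuit_b: Iterable[Hashable]) -> float:
    """Jaccard overlap between two circuits given as collections of component ids.

    Assumption: a circuit is represented as a set of identifiers (for example
    attention heads or edges). The paper may use a different overlap measure.
    """
    a, b = set(circuit_a), set(circuit_b)
    if not a and not b:
        return 0.0  # two empty circuits share nothing by convention
    return len(a & b) / len(a | b)


def generality(post_edit_preds: Sequence[str], targets: Sequence[str]) -> float:
    """Accuracy after the edit on held-out tasks that share the edited pattern."""
    if len(post_edit_preds) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    return sum(p == t for p, t in zip(post_edit_preds, targets)) / len(targets)


def locality(pre_edit_preds: Sequence[str], post_edit_preds: Sequence[str]) -> float:
    """Fraction of predictions on other reasoning patterns left unchanged by the edit."""
    if len(pre_edit_preds) != len(post_edit_preds):
        raise ValueError("pre- and post-edit predictions must have the same length")
    if not pre_edit_preds:
        return 0.0
    unchanged = sum(a == b for a, b in zip(pre_edit_preds, post_edit_preds))
    return unchanged / len(pre_edit_preds)


if __name__ == "__main__":
    # Hypothetical component labels for two reasoning patterns.
    pattern_a = {"L3.H1", "L5.H7", "L9.MLP"}
    pattern_b = {"L5.H7", "L9.MLP", "L11.H2"}
    print("overlap:", circuit_overlap(pattern_a, pattern_b))
    print("generality:", generality(["T", "F", "T"], ["T", "F", "F"]))
    print("locality:", locality(["T", "F", "T", "T"], ["T", "F", "F", "T"]))
```

Under the Circuit-Interference Law as stated in the abstract, an edit to one pattern would be expected to lower locality more for patterns whose circuits overlap more with the edited one.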
Peer Reviews
Decision: ICLR 2026 Poster
- The presentation of the paper is well-organized and convincing
- The perspective of the knowledge circuit and the finding of Circuit-Interference Law is inspiring and novel.

- The work lacks sufficient experimental validation. The current experiments in Sec. 4 are conducted on only one model (Qwen-2.5-3B) and one dataset (ContextHub), which limits the generality of the conclusions.
- The choice of base model is inconsistent between the preliminary analysis in Sec. 2.2 and the main experiments in Sec. 4. In addition, the experimental setup in Sec. 3.1 is not clearly described (not sure if this is also based on Qwen-2.5-3B).
- The method appears overly complex, but

* The shift from knowledge editing to reasoning editing is original and well-motivated. The generality–locality trade-off is crisply formulated and backed by empirical evidence
* The results include confidence intervals, which make them more reliable and indicate the stability of the findings.

* The experiments focus only on propositional logic and structured math tasks, so they don’t fully reflect how reasoning works in more open-ended, real-world settings. It’s still unclear whether the method would hold up beyond these controlled, symbolic cases.
* While “circuits” are central, empirical evidence that reshaped circuits correspond to interpretable submodules is limited to correlation plots. There’s no qualitative analysis of what circuits actually represent.
* The approach is evalua

- The paper's idea is very interesting and shifts attention from factual knowledge toward the logic applied by models, which is a source of many flaws in their performance.
- The Circuit-Interference Law is a significant novelty for the community.

- The datasets and models used are limited. While it does not seem like the conclusions would differ with larger models, the use of the well-controlled ContextHub raises questions about how things might go wrong with wild, real-world data.
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks
