Reason-KE++: Aligning the Process, Not Just the Outcome, for Faithful LLM Knowledge Editing
Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao

TL;DR
This paper introduces Reason-KE++, a framework that aligns large language models with new knowledge by focusing on process-level reasoning rather than just outcomes, significantly improving factual faithfulness and reasoning accuracy.
Contribution
Proposes an SFT+RL framework whose Stage-aware Reward mechanism supervises intermediate reasoning steps, targeting the "faithfulness gap" that SFT-only knowledge editing leaves open.
Findings
Achieves 95.48% on MQUAKE-CF-3k, a new state of the art.
Identifies outcome-only RL as a deceptive trap for LLM alignment.
Demonstrates process-level alignment improves trustworthiness and reasoning accuracy.
Abstract
Aligning Large Language Models (LLMs) to be faithful to new knowledge in complex, multi-hop reasoning tasks is a critical, yet unsolved, challenge. We find that SFT-based methods, e.g., Reason-KE, while state-of-the-art, suffer from a "faithfulness gap": they optimize for format mimicry rather than sound reasoning. This gap enables the LLM's powerful parametric priors to override new contextual facts, resulting in critical factual hallucinations (e.g., incorrectly reasoning "Houston" from "NASA" despite an explicit edit). To solve this core LLM alignment problem, we propose Reason-KE++, an SFT+RL framework that instills process-level faithfulness. Its core is a Stage-aware Reward mechanism that provides dense supervision for intermediate reasoning steps (e.g., Decomposition, Sub-answer Correctness). Crucially, we identify that naive outcome-only RL is a deceptive trap for LLM alignment:…
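The contrast between an outcome-only reward and a stage-aware reward can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Trace` structure, the exact-match scoring, the stage weights, and the example edit (NASA's headquarters moved to "Paris") are all assumptions made for illustration. Only the stage names (Decomposition, Sub-answer Correctness) come from the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical parsed reasoning trace (structure assumed for illustration)."""
    sub_questions: list[str]
    sub_answers: list[str]
    final_answer: str


def _norm(text: str) -> str:
    """Case- and whitespace-insensitive comparison key."""
    return text.strip().lower()


def outcome_only_reward(pred: Trace, gold: Trace) -> float:
    """Scores only the final answer; intermediate steps are invisible to it."""
    return 1.0 if _norm(pred.final_answer) == _norm(gold.final_answer) else 0.0


def stage_aware_reward(pred: Trace, gold: Trace, w_decomp: float = 0.3,
                       w_sub: float = 0.4, w_final: float = 0.3) -> float:
    """Weighted sum over stages; the weights are arbitrary placeholders."""
    # Decomposition: crude proxy, checks only the number of hops produced.
    decomp = 1.0 if len(pred.sub_questions) == len(gold.sub_questions) else 0.0
    # Sub-answer Correctness: fraction of gold hops answered correctly, in order.
    hits = sum(_norm(p) == _norm(g)
               for p, g in zip(pred.sub_answers, gold.sub_answers))
    sub = hits / len(gold.sub_answers) if gold.sub_answers else 0.0
    return w_decomp * decomp + w_sub * sub + w_final * outcome_only_reward(pred, gold)


# Hypothetical edit: "NASA's headquarters is located in Paris."
gold = Trace(
    sub_questions=["Where is NASA's headquarters?", "Which country is that city in?"],
    sub_answers=["Paris", "France"],
    final_answer="France",
)
# The first hop ignores the edit, yet the final answer still matches.
unfaithful = Trace(
    sub_questions=gold.sub_questions,
    sub_answers=["Houston", "France"],
    final_answer="France",
)

print(outcome_only_reward(unfaithful, gold))
print(stage_aware_reward(unfaithful, gold))
print(stage_aware_reward(gold, gold))
```

In this toy setup the outcome-only reward gives the unfaithful trace full credit, while the stage-aware reward scores it below the faithful one because the first sub-answer contradicts the edit. A real system would need a more robust decomposition check and answer matcher than the exact-match proxies used here.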
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Natural Language Processing Techniques
