Can Fine-Tuning Erase Your Edits? On the Fragile Coexistence of Knowledge Editing and Adaptation
Yinjie Cheng, Paul Youssef, Christin Seifert, Jörg Schlötterer, Zhixue Zhao

TL;DR
This paper investigates how fine-tuning affects knowledge edits in large language models. It finds that edits often decay after fine-tuning and can be selectively removed, which matters both for model safety (removing malicious edits) and for maintenance (preserving beneficial ones).
Contribution
The paper provides the first systematic analysis of edit survival after fine-tuning, comparing different editing methods and fine-tuning strategies, and offers practical guidelines for combining model editing with adaptation.
Findings
Edits decay after fine-tuning, with survival rates varying by method.
Fine-tuning only edited layers can remove edits with minimal performance loss.
Fine-tuning non-edited layers can impair more edits than full fine-tuning.
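The findings above compare three fine-tuning scopes (edited layers only, non-edited layers only, full) and report edit survival for each. The sketch below is one way to set up that comparison. It is not the authors' code: the survival definition (the share of edits that held before fine-tuning and still hold afterwards), the `layers.<index>.` parameter naming scheme, and the function names `survival_rate` and `trainable_names` are all assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, List, Set


def survival_rate(before: Dict[str, bool], after: Dict[str, bool]) -> float:
    """Assumed metric: of the edits that held before fine-tuning,
    the fraction that still hold after it."""
    held = [edit_id for edit_id, ok in before.items() if ok]
    if not held:
        return 0.0
    survived = sum(1 for edit_id in held if after.get(edit_id, False))
    return survived / len(held)


# Hypothetical naming scheme: block parameters contain "layers.<index>.".
_LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def trainable_names(
    param_names: Iterable[str], edited_layers: Set[int], mode: str
) -> List[str]:
    """Select the parameter names to fine-tune.

    mode "edited":     only parameters inside the edited layers
    mode "non_edited": everything outside the edited layers
                       (including parameters with no layer index)
    mode "full":       every parameter
    """
    if mode not in {"edited", "non_edited", "full"}:
        raise ValueError(f"unknown mode: {mode}")
    selected = []
    for name in param_names:
        match = _LAYER_PATTERN.search(name)
        in_edited = match is not None and int(match.group(1)) in edited_layers
        if (
            mode == "full"
            or (mode == "edited" and in_edited)
            or (mode == "non_edited" and not in_edited)
        ):
            selected.append(name)
    return selected
```

With a PyTorch model, the returned names could be used to set `requires_grad` on the entries of `model.named_parameters()`, so that only the chosen scope is updated during fine-tuning. The real parameter names depend on the model architecture and would need to be checked first.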
Abstract
Knowledge editing has emerged as a lightweight alternative to retraining for correcting or injecting specific facts in large language models (LLMs). Meanwhile, fine-tuning remains the default operation for adapting LLMs to new domains and tasks. Despite their widespread adoption, these two post-training interventions have been studied in isolation, leaving open a crucial question: if we fine-tune an edited model, do the edits survive? This question is motivated by two practical scenarios: removing covert or malicious edits, and preserving beneficial edits. If fine-tuning impairs edits (Fig.1), current KE methods become less useful, as every fine-tuned model would require re-editing, which significantly increases the cost; if edits persist, fine-tuned models risk propagating hidden malicious edits, raising serious safety concerns. To this end, we systematically quantify edit decay after…
Taxonomy
Topics: Topic Modeling · Scientific Computing and Data Management · Advanced Graph Neural Networks
