Large Language Model Unlearning for Source Code

Xue Jiang; Yihong Dong; Huangzhao Zhang; Tangxinyu Wang; Zheng Fang; Yingwei Ma; Rongyu Cao; Binhua Li; Zhi Jin; Wenpin Jiao; Yongbin Li; Ge Li

arXiv:2506.17125 · cs.SE · November 25, 2025

Open Access

TL;DR

This paper introduces PROD, a precise unlearning method for source code in large language models that erases targeted code while largely preserving overall performance, and establishes a benchmark for evaluating it.

Contribution

The paper proposes a novel unlearning technique called PROD for source code, along with a benchmark and metric to evaluate unlearning effectiveness in LLMs.

Findings

1. PROD outperforms existing methods in forget quality and utility.
2. It maintains model performance while precisely erasing targeted code snippets.
3. PROD shows robustness against adversarial attacks.

Abstract

While Large Language Models (LLMs) excel at code generation, their inherent tendency toward verbatim memorization of training data introduces critical risks such as copyright infringement, emission of insecure code, and use of deprecated APIs. A straightforward yet promising defense is unlearning, i.e., erasing or down-weighting the offending snippets through post-training. However, we find that applying it to source code tends to spill over, damaging the basic programming-language knowledge learned by the LLM and degrading its overall capability. To address this challenge, we propose PROD for precise source code unlearning. PROD surgically zeroes out the prediction probability of the prohibited tokens and renormalizes the remaining distribution so that the generated code stays correct. By excising only the targeted snippets, PROD achieves precise forgetting without much degradation…
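
The "zero out and renormalize" step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `suppress_and_renormalize`, the dictionary representation of a next-token distribution, and the toy probabilities are all assumptions made for illustration, and the abstract does not say how the resulting distribution is used during post-training.

```python
from typing import Dict, Set


def suppress_and_renormalize(
    probs: Dict[str, float], forbidden: Set[str]
) -> Dict[str, float]:
    """Set forbidden tokens to probability 0 and rescale the rest to sum to 1.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Probability mass that remains once forbidden tokens are excluded.
    kept_mass = sum(p for tok, p in probs.items() if tok not in forbidden)
    if kept_mass <= 0.0:
        raise ValueError("all probability mass lies on forbidden tokens")
    # Forbidden tokens get 0; every other token keeps its relative weight.
    return {
        tok: 0.0 if tok in forbidden else p / kept_mass
        for tok, p in probs.items()
    }


# Toy next-token distribution with made-up numbers (an assumption).
next_token = {"md5": 0.5, "sha256": 0.3, "blake2b": 0.2}
target = suppress_and_renormalize(next_token, forbidden={"md5"})
```

In this toy case the prohibited token ends at probability zero, while the two remaining tokens keep their original 3:2 ratio and together carry all of the probability mass. That preserved ordering among permitted tokens is one way to read the abstract's claim that the generated code "stays correct".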

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing