Mitigating Sensitive Information Leakage in LLMs4Code through Machine Unlearning
Shanzhi Gu, Zhaoyang Qu, Ruotong Geng, Mingyang Geng, Shangwen Wang, Chuanfu Xu, Haotian Wang, Zhipeng Lin, Dezun Dong

TL;DR
This paper empirically evaluates machine unlearning techniques to reduce sensitive information leakage in Large Language Models for Code, demonstrating significant privacy improvements while maintaining high code-generation performance.
Contribution
It provides the first comprehensive benchmark and analysis of unlearning algorithms applied to LLMs4Code for privacy preservation.
Findings
Direct leak rate drops by over 50% after unlearning.
Over 91% of original code-generation capability is retained.
Unlearning shifts leakage from direct to indirect, revealing new challenges.
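The two headline percentages above can be read as simple ratios. The sketch below assumes that "drops by over 50%" is a relative reduction and that capability retention is the ratio of a post-unlearning score to the original score; the summary does not state either definition. The function names and all numbers are hypothetical and chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def relative_drop(before: float, after: float) -> float:
    """Fraction by which a rate fell, relative to its starting value."""
    return (before - after) / before


def retention(original_score: float, unlearned_score: float) -> float:
    """Share of the original score that survives after unlearning."""
    return unlearned_score / original_score


# Hypothetical values for illustration only (not taken from the paper).
leak_before, leak_after = 0.40, 0.18   # direct leak rate before/after unlearning
code_before, code_after = 0.50, 0.46   # a code-generation score before/after

print(f"relative leak reduction: {relative_drop(leak_before, leak_after):.0%}")
print(f"capability retained:     {retention(code_before, code_after):.0%}")
```

Under the alternative reading, an absolute drop of 50 percentage points, the first figure would instead be computed as `before - after`.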
Abstract
Large Language Models for Code (LLMs4Code) have achieved strong performance in code generation, but recent studies reveal that they may memorize and leak sensitive information contained in training data, posing serious privacy risks. To address this gap, this work presents the first comprehensive empirical study on applying machine unlearning to mitigate sensitive information leakage in LLMs4Code. We first construct a dedicated benchmark that includes: (i) a synthetic forget set containing diverse forms of personal information, and (ii) a retain set designed to evaluate whether code-generation capability is preserved after unlearning. Using this benchmark, we systematically assess three representative unlearning algorithms (GA, GA+GD, GA+KL) across three widely used open-source LLMs4Code models (AIXCoder-7B, CodeLlama-7B, CodeQwen-7B). Experimental results demonstrate that machine…
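As a rough illustration of how the three objectives named in the abstract differ, here is a minimal sketch on toy token distributions. It assumes the abbreviations carry their usual meaning in the unlearning literature: GA is gradient ascent on the forget set, GD adds ordinary gradient descent on the retain set, and KL adds a KL-divergence penalty that keeps the unlearned model close to the original model on the retain set. The abstract does not spell these out, so the formulas, the KL direction, and every function name below are assumptions rather than the authors' implementation.

```python
import math


def nll(probs, targets):
    """Mean negative log-likelihood of the target token at each position."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


def kl(ref_probs, cur_probs):
    """Mean KL(ref || cur) across positions (direction is an assumption)."""
    total = 0.0
    for ref, cur in zip(ref_probs, cur_probs):
        total += sum(r * math.log(r / c) for r, c in zip(ref, cur) if r > 0)
    return total / len(ref_probs)


def ga_loss(forget_probs, forget_targets):
    # Gradient ascent: minimising the negated NLL raises the loss on the
    # forget set, pushing the model away from the sensitive tokens.
    return -nll(forget_probs, forget_targets)


def ga_gd_loss(forget_probs, forget_targets, retain_probs, retain_targets):
    # GA plus standard descent on the retain set to preserve capability.
    return ga_loss(forget_probs, forget_targets) + nll(retain_probs, retain_targets)


def ga_kl_loss(forget_probs, forget_targets, retain_cur, retain_ref):
    # GA plus a penalty for drifting from the original model on retain data.
    return ga_loss(forget_probs, forget_targets) + kl(retain_ref, retain_cur)


# Toy per-position distributions over a three-token vocabulary (hypothetical).
forget_probs, forget_targets = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], [0, 1]
retain_cur, retain_targets = [[0.6, 0.3, 0.1]], [0]
retain_ref = [[0.5, 0.4, 0.1]]

print(ga_loss(forget_probs, forget_targets))
print(ga_gd_loss(forget_probs, forget_targets, retain_cur, retain_targets))
print(ga_kl_loss(forget_probs, forget_targets, retain_cur, retain_ref))
```

In a real training loop these scalars would be computed from model logits and backpropagated; the toy version only shows which terms each variant combines.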
