How Quantization Impacts Privacy Risk on LLMs for Code?

Md Nazmul Haque; Hua Yang; Zhou Yang; Bowen Xu

arXiv:2508.00128·cs.SE·August 4, 2025

Open Access

TL;DR

This study empirically examines how quantization affects privacy risks and task performance in large language models for code, revealing that quantization can reduce privacy risks and that a tradeoff exists between performance and privacy.

Contribution

First empirical analysis of the impact of quantization on privacy risk and task performance in LLMs for code across multiple architectures and sizes.

Findings

01. Quantization reduces privacy risks compared to full-precision models.

02. A positive correlation exists between task performance and privacy risk.

03. Quantizing larger models can offer a better privacy-performance balance.

Abstract

Large language models for code (LLMs4Code) rely heavily on massive training data, including sensitive data such as projects' cloud service credentials and developers' personally identifiable information, raising serious privacy concerns. Membership inference (MI) has recently emerged as an effective tool for assessing privacy risk by identifying whether specific data belong to a model's training set. In parallel, model compression techniques, especially quantization, have gained traction for reducing computational costs and enabling the deployment of large models. However, while quantized models still retain knowledge learned from the original training data, it remains unclear whether quantization affects their ability to retain and expose private information. Answering this question is important for understanding privacy risks in real-world deployments. In this…
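To make the idea of membership inference concrete, here is a minimal sketch of one common baseline, a loss-threshold attack scored by AUC. It is an illustration only and is not taken from the paper: the function name `mi_auc`, the assumption that a lower per-sample loss signals membership, and the toy loss values are all hypothetical.

```python
def mi_auc(member_losses, nonmember_losses):
    """AUC of a loss-threshold membership inference attack (illustrative).

    Assumption: a lower loss is treated as evidence that a sample was in
    the training set. The AUC then equals the probability that a randomly
    chosen member has a lower loss than a randomly chosen non-member,
    with ties counted as half.
    """
    if not member_losses or not nonmember_losses:
        raise ValueError("both groups must be non-empty")

    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0   # member correctly ranked as more "familiar"
            elif m == n:
                wins += 0.5   # tie: no information either way
    return wins / (len(member_losses) * len(nonmember_losses))


if __name__ == "__main__":
    # Made-up per-sample losses, purely for demonstration.
    members = [0.2, 0.5, 0.9]
    nonmembers = [0.4, 1.1, 1.3]
    print(f"MI AUC: {mi_auc(members, nonmembers):.3f}")
```

An AUC near 0.5 means the attack cannot tell members from non-members, while values closer to 1.0 indicate higher exposure. Under these assumptions, computing the same score for a full-precision model and for its quantized counterpart would be one way to compare their privacy risk.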

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques · Scientific Computing and Data Management