On the Privacy of LLMs: An Ablation Study
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez, Peter Sotomango, Mahmoud Awawdah, Sami Zhioua

TL;DR
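This study systematically evaluates privacy risks in large language models by reproducing four attack types (membership inference, attribute inference, data extraction, and backdoor) under varying system factors such as model architecture, scale, dataset characteristics, and retrieval configuration. It finds that privacy vulnerabilities are highly context-dependent.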
Contribution
It introduces a unified threat model, reproduces key privacy attacks, and conducts a structured ablation study to understand how system factors influence attack success.
Findings
Membership inference attacks are highly effective, especially mask-based variants.
Backdoor attacks achieve consistently high success rates when their triggers are activated.
Attribute inference and data extraction are more challenging but still pose risks.
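To make the membership inference finding concrete, here is a minimal sketch of a generic loss-threshold attack. It is not the mask-based variant the paper evaluates. The function names, the threshold, and the toy loss values are all assumptions made for illustration; in practice the losses would come from querying the target model on candidate samples.

```python
# Hypothetical loss-threshold membership inference sketch.
# Intuition: samples seen during training tend to have lower loss.

def predict_members(losses, threshold):
    """Flag a sample as a suspected training member if its loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_auc(member_losses, nonmember_losses):
    """AUC of the attack: probability that a random member has lower loss
    than a random non-member (ties count as 0.5)."""
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


# Made-up per-sample losses, for illustration only.
member_losses = [0.2, 0.4, 0.3, 0.9]
nonmember_losses = [1.1, 0.8, 1.5, 0.35]

print(predict_members(member_losses, threshold=0.5))
print(attack_auc(member_losses, nonmember_losses))
```

An AUC near 0.5 would mean the loss signal barely separates members from non-members, while values closer to 1.0 indicate a stronger attack signal.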
Abstract
Large language models (LLMs) are increasingly deployed in interactive and retrieval-augmented settings, raising significant privacy concerns. While attacks such as Membership Inference (MIA), Attribute Inference (AIA), Data Extraction (DEA), and Backdoor Attacks (BA) have been studied, they are typically analyzed in isolation, leaving a gap in understanding their behavior under common system factors. In this paper, we introduce a unified threat model and notation, reproduce a representative set of privacy attacks, and conduct a structured ablation study to evaluate the impact of key factors such as model architecture, scale, dataset characteristics, and retrieval configuration. Our analysis reveals clear differences across attack types. Membership inference attacks, particularly mask-based variants, exhibit strong and reliable signals, while backdoor attacks achieve consistently high…
