TL;DR
This paper introduces ProjRes, a novel projection residuals-based membership inference attack that effectively exposes privacy vulnerabilities in federated large language models, outperforming existing methods.
Contribution
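The paper presents ProjRes, the first projection residuals-based passive membership inference attack (MIA) tailored for federated LLMs (FedLLMs). It requires no shadow models, auxiliary classifiers, or historical updates, and it reveals significant privacy risks in federated fine-tuning.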
Findings
ProjRes achieves nearly 100% attack accuracy.
It outperforms prior methods by up to 75.75%.
It remains effective under strong differential privacy defenses.
Abstract
Federated Large Language Models (FedLLMs) enable multiple parties to collaboratively fine-tune LLMs without sharing raw data, addressing challenges of limited resources and privacy concerns. Despite data localization, shared gradients can still expose sensitive information through membership inference attacks (MIAs). However, FedLLMs' unique properties, i.e., massive parameter scales, rapid convergence, and sparse, non-orthogonal gradients, render existing MIAs ineffective. To address this gap, we propose ProjRes, the first projection residuals-based passive MIA tailored for FedLLMs. ProjRes leverages hidden embedding vectors as sample representations and analyzes their projection residuals on the gradient subspace to uncover the intrinsic link between gradients and inputs. It requires no shadow models, auxiliary classifiers, or historical updates, ensuring efficiency and robustness.…
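The projection-residual idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm; it only assumes the standard fact that the weight gradient of a single linear layer is a sum of outer products of upstream gradients and layer inputs, so its row space is spanned by the inputs in the batch. The function name `projection_residual`, the dimensions, and the random data are all hypothetical and chosen for illustration.

```python
import numpy as np

def projection_residual(grad, h, rcond=1e-10):
    """Relative norm of the component of h outside the row space of grad.

    Illustrative helper (hypothetical, not from the paper).
    """
    # Orthonormal basis of the gradient's row space via SVD.
    _, s, vt = np.linalg.svd(grad, full_matrices=False)
    rank = int(np.sum(s > rcond * s[0]))
    basis = vt[:rank]                      # shape (rank, d_in)
    projection = basis.T @ (basis @ h)     # part of h inside the subspace
    return np.linalg.norm(h - projection) / np.linalg.norm(h)

rng = np.random.default_rng(0)
d_in, d_out, batch = 64, 32, 4

# Assumed toy setup: hidden embeddings of the training (member) samples
# and the upstream gradients that reach a linear layer y = W h.
members = rng.normal(size=(batch, d_in))
upstream = rng.normal(size=(batch, d_out))

# Weight gradient of the linear layer: sum_i upstream_i * members_i^T.
grad = upstream.T @ members                # shape (d_out, d_in)

non_member = rng.normal(size=d_in)

r_member = projection_residual(grad, members[0])
r_non_member = projection_residual(grad, non_member)
print(f"member residual:     {r_member:.2e}")
print(f"non-member residual: {r_non_member:.2e}")
```

In this toy setting a member embedding lies in the gradient's row space, so its residual is close to zero, while a random non-member embedding keeps most of its norm outside that subspace. A real attack on a FedLLM would have to deal with the sparse, non-orthogonal gradients the abstract mentions, which this sketch does not model.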
