Reconstructing Training Data from Adapter-based Federated Large Language Models
Silong Chen, Yuchuan Luo, Guilin Deng, Yi Liu, Min Xu, Shaojing Fu, Xiaohua Jia

TL;DR
This paper introduces UTR, a gradient inversion attack that reconstructs training data from adapter-based federated large language models, showing that lightweight adaptation does not by itself prevent privacy leakage.
Contribution
The paper presents UTR, a gradient inversion attack (GIA) tailored to adapter-based federated LLMs (FedLLMs). It shows that the attack reconstructs training data effectively, challenging the assumption that training only lightweight adapters protects privacy.
Findings
UTR achieves ROUGE scores above 99% in data reconstruction.
Adapter-based FedLLMs remain vulnerable to privacy leaks even though only low-rank adapters are trained.
Prior GIAs fail in large-batch settings, but UTR remains effective.
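The findings are stated in terms of ROUGE, which scores n-gram overlap between a reference text and a candidate. The following is a minimal sketch of ROUGE-1 F1 (unigram overlap), not the paper's evaluation code: the function name `rouge1_f1`, the whitespace tokenization, and the example sentences are all illustrative assumptions.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Multiset intersection counts each shared word at most as often
    # as it appears in both texts.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences, not taken from the paper.
original = "the patient was prescribed insulin"
shuffled = "insulin was prescribed the patient"
partial = "the patient was sent home"

print(rouge1_f1(original, shuffled))  # same words, different order
print(rouge1_f1(original, partial))   # three of five words recovered
```

Because ROUGE-1 counts words regardless of position, a reconstruction that recovers the right bag of words in the wrong order still scores perfectly on it. Order-sensitive variants such as ROUGE-2 and ROUGE-L penalize the shuffled sentence.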
Abstract
Adapter-based Federated Large Language Models (FedLLMs) are widely adopted to reduce the computational, storage, and communication overhead of full-parameter fine-tuning for web-scale applications while preserving user privacy. By freezing the backbone and training only compact low-rank adapters, these methods appear to limit gradient leakage and thwart existing Gradient Inversion Attacks (GIAs). Contrary to this assumption, we show that low-rank adapters create new, exploitable leakage channels. We propose the Unordered-word-bag-based Text Reconstruction (UTR) attack, a novel GIA tailored to the unique structure of adapter-based FedLLMs. UTR overcomes three core challenges (low-dimensional gradients, frozen backbones, and combinatorially large reconstruction spaces) by: (i) inferring token presence from attention patterns in frozen layers, (ii) performing sentence-level inversion…
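The abstract's premise, that training only low-rank adapters shrinks what a client shares, can be made concrete with a parameter count. This is a rough sketch that assumes a LoRA-style adapter, meaning a rank-r update B·A added to a frozen d_out × d_in weight matrix. The abstract does not specify the adapter design, and the dimensions and function names below are illustrative only.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Entries in the gradient of a dense d_out x d_in weight matrix."""
    return d_in * d_out


def adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Entries in the gradients of A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Illustrative sizes (assumed, not from the paper).
d_in = d_out = 4096
rank = 8

full = full_params(d_in, d_out)
low = adapter_params(d_in, d_out, rank)
ratio = low / full

print(f"full-matrix gradient entries: {full}")
print(f"adapter gradient entries:     {low}")
print(f"fraction shared:              {ratio:.6f}")
```

Under these assumptions the client shares roughly 0.4% of the entries it would share for the full matrix. That is the "low-dimensional gradients" setting the abstract names as a challenge for inversion.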
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Topic Modeling
