Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?
Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Ronghua Li

TL;DR
This paper analyzes how Low Rank Adaptation (LoRA) impacts the robustness of large language models against training-time attacks, revealing it offers better defense against backdoors but increased vulnerability to data poisoning.
Contribution
It provides a theoretical framework linking LoRA's low-rank structure to its security properties and validates findings with extensive experiments.
Findings
LoRA is more robust to backdoor attacks than full fine-tuning.
LoRA is more vulnerable to data poisoning due to simplified information geometry.
Theoretical analysis aligns with experimental results.
Abstract
Low rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its superb efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior under training-time attacks remains underexplored, posing significant security risks. In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low rank structure and its vulnerability against training-time attacks. Our analysis indicates that LoRA exhibits…
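For readers unfamiliar with the low-rank structure the abstract refers to, the following is a minimal NumPy sketch of a generic LoRA-style update, not the paper's analytical framework. The names `lora_delta`, `d_out`, `d_in`, `r`, and `alpha` are illustrative assumptions, and `B` is drawn randomly here (rather than zero-initialized, as is common in practice) only so that the rank of the update is visible.

```python
import numpy as np

def lora_delta(B, A, alpha, r):
    """Low-rank weight update: a scaled product of two thin matrices (hypothetical helper)."""
    return (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 32, 4          # illustrative sizes, not taken from the paper

W0 = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
B = rng.standard_normal((d_out, r))       # trainable factor (random only for illustration)
A = rng.standard_normal((r, d_in))        # trainable factor

delta = lora_delta(B, A, alpha=8.0, r=r)
W_adapted = W0 + delta                    # effective weight after adaptation

# The update can never have rank above r, however large W0 is.
update_rank = np.linalg.matrix_rank(delta)

# Trainable parameters: full fine-tuning vs. the two low-rank factors.
full_params = d_out * d_in
lora_params = r * (d_out + d_in)
```

With these sizes, full fine-tuning would train 2048 entries while the two factors hold 384, and the update is confined to a subspace of rank at most 4. This restricted update space is the structural property whose security implications the paper studies.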
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Stochastic Gradient Optimization Techniques
