LoRAShield: Data-Free Editing Alignment for Secure Personalized LoRA Sharing
Jiahao Chen, Junhao Li, Yiming Wang, Zhe Ma, Yi Jiang, Chunyi Zhou, Qingming Li, Tianyu Du, Shouling Ji

TL;DR
LoRAShield is a novel data-free framework that dynamically edits LoRA models to prevent malicious content generation, enhancing security and trustworthiness in personalized text-to-image generation sharing platforms.
Contribution
It introduces the first data-free editing method for LoRA models, addressing their unique vulnerabilities and enabling secure sharing without compromising benign functionalities.
Findings
Effectively blocks malicious content generation
Maintains model performance on benign tasks
Operates efficiently without additional data
Abstract
The proliferation of Low-Rank Adaptation (LoRA) models has democratized personalized text-to-image generation, enabling users to share lightweight models (e.g., personal portraits) on platforms like Civitai and Liblib. However, this "share-and-play" ecosystem introduces critical risks: benign LoRAs can be weaponized by adversaries to generate harmful content (e.g., political, defamatory imagery), undermining creator rights and platform safety. Existing defenses like concept-erasure methods focus on full diffusion models (DMs), neglecting LoRA's unique role as a modular adapter and its vulnerability to adversarial prompt engineering. To bridge this gap, we propose LoRAShield, the first data-free editing framework for securing LoRA models against misuse. Our platform-driven approach dynamically edits and realigns LoRA's weight subspace via adversarial optimization and semantic…
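For readers unfamiliar with why a LoRA is "lightweight" and "modular", the following is a minimal NumPy sketch of the standard low-rank update W = W0 + (alpha / r) * B A. It illustrates generic LoRA mechanics only, not LoRAShield's editing procedure, which the truncated abstract does not specify. The layer size, rank, scaling factor, and the helper name `lora_update` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
d_out, d_in, r = 64, 64, 4
alpha = 8.0  # hypothetical LoRA scaling factor

W0 = rng.standard_normal((d_out, d_in))  # frozen base-model weight
B = rng.standard_normal((d_out, r))      # LoRA "up" factor
A = rng.standard_normal((r, d_in))       # LoRA "down" factor


def lora_update(B, A, alpha, r):
    """Return the low-rank weight update contributed by the adapter."""
    return (alpha / r) * (B @ A)


update = lora_update(B, A, alpha, r)
W = W0 + update  # weights used when the adapter is attached

# The adapter stores far fewer numbers than the full weight matrix.
lora_params = B.size + A.size
full_params = W0.size

# The update has rank at most r, so it only touches a small weight subspace.
update_rank = np.linalg.matrix_rank(update)

# Modularity: subtracting the update recovers the base weights.
detached = W - update

print(lora_params, full_params, update_rank)
```

Because the adapter is stored as the small factors B and A, it is cheap to share, and any change to those factors alters only the low-rank subspace they span rather than the full base model.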
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Generative Adversarial Networks and Image Synthesis
