BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models
Maozhen Zhang, Mengnan Zhao, Wei Wang, Bo Wang

TL;DR
This paper introduces BadPromptFL, a backdoor attack on prompt-based federated learning in multimodal models, demonstrating high success rates and stealthiness, thereby exposing security vulnerabilities in privacy-preserving collaborative AI systems.
Contribution
It is the first to reveal backdoor vulnerabilities in prompt-based federated learning for multimodal models, highlighting a new attack vector in this emerging domain.
Findings
Achieves over 90% attack success rate
Stays stealthy and requires only minimal client participation
Effective across multiple datasets and protocols
Abstract
Prompt-based tuning has emerged as a lightweight alternative to full fine-tuning in large vision-language models, enabling efficient adaptation via learned contextual prompts. This paradigm has recently been extended to federated learning settings (e.g., PromptFL), where clients collaboratively train prompts under data privacy constraints. However, the security implications of prompt-based aggregation in federated multimodal learning remain largely unexplored, leaving a critical attack surface unaddressed. In this paper, we introduce BadPromptFL, the first backdoor attack targeting prompt-based federated learning in multimodal contrastive models. In BadPromptFL, compromised clients jointly optimize local backdoor triggers and prompt embeddings, injecting poisoned prompts into the global aggregation process. These prompts are then propagated to benign clients, enabling universal…
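The aggregation step the abstract refers to can be illustrated with a toy sketch. This is not the paper's attack: it assumes a plain FedAvg-style weighted average of prompt vectors, and the names `fedavg_prompts`, `benign`, and `poisoned`, as well as the two-dimensional prompt values, are hypothetical. It only shows how a single client's submitted prompt ends up inside the global prompt that every client later receives; the joint optimization of trigger and prompt is not modeled.

```python
from typing import List

# Hypothetical representation: a prompt is a flat list of floats.
Prompt = List[float]


def fedavg_prompts(client_prompts: List[Prompt], weights: List[float]) -> Prompt:
    """Weighted average of client prompt vectors (assumed FedAvg-style rule)."""
    if not client_prompts or len(client_prompts) != len(weights):
        raise ValueError("need one weight per client prompt")
    dim = len(client_prompts[0])
    if any(len(p) != dim for p in client_prompts):
        raise ValueError("all prompts must have the same dimension")
    total = sum(weights)
    return [
        sum(w * p[i] for p, w in zip(client_prompts, weights)) / total
        for i in range(dim)
    ]


# Toy values, made up for illustration only.
benign = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
poisoned = [0.0, 4.0]  # stands in for a prompt crafted by a compromised client

clean_global = fedavg_prompts(benign, [1.0, 1.0, 1.0])
attacked_global = fedavg_prompts(benign + [poisoned], [1.0, 1.0, 1.0, 1.0])

print(clean_global)     # [1.0, 0.0]
print(attacked_global)  # [0.75, 1.0]
```

With equal weights, one poisoned prompt out of four shifts the second coordinate of the global prompt from 0.0 to 1.0, and that shifted prompt is what the server would send back to the benign clients.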
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Access Control and Trust · Topic Modeling
