Stealthy and Adjustable Text-Guided Backdoor Attacks on Multimodal Pretrained Models
Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao, Qinmu Peng, Tongliang Liu, and Xinge You

TL;DR
This paper introduces a novel text-guided backdoor attack on multimodal models that uses common words as triggers, enhancing stealthiness and allowing adjustable attack success rates.
Contribution
It proposes a new, practical backdoor attack method that employs textual triggers and visual perturbations, revealing security vulnerabilities in multimodal pretrained models.
Findings
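The Text-Guided Backdoor (TGB) attack achieves high attack success rates in real-world scenarios.
The attack is stealthy because its triggers are commonly occurring words, and its success rate can be adjusted through visual adversarial perturbations on poisoned samples.
Experiments demonstrate vulnerabilities in downstream tasks built on multimodal pretrained models, including Composed Image Retrieval (CIR) and Visual Question Answering (VQA).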
Abstract
Multimodal pretrained models are vulnerable to backdoor attacks, yet most existing methods rely on visual or multimodal triggers, which are impractical since visually embedded triggers rarely occur in real-world data. To overcome this limitation, we propose a novel Text-Guided Backdoor (TGB) attack on multimodal pretrained models, where commonly occurring words in textual descriptions serve as backdoor triggers, significantly improving stealthiness and practicality. Furthermore, we introduce visual adversarial perturbations on poisoned samples to modulate the model's learning of textual triggers, enabling a controllable and adjustable TGB attack. Extensive experiments on downstream tasks built upon multimodal pretrained models, including Composed Image Retrieval (CIR) and Visual Question Answering (VQA), demonstrate that TGB achieves practicality and stealthiness with adjustable attack…
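The abstract refers to an adjustable attack success rate without spelling out how it is measured. As a rough illustration only, the sketch below uses the common definition from the backdoor literature: among inputs whose text contains the trigger word, the fraction for which the model returns the attacker's target. The function names, the whole-word matching rule, and the exact-match comparison are assumptions made for this example, not the paper's evaluation protocol (a retrieval task such as CIR may instead check whether the target appears among the top results).

```python
import re


def contains_trigger(text: str, trigger: str) -> bool:
    """Return True if `trigger` occurs in `text` as a whole word (case-insensitive).

    Whole-word matching is an assumption of this sketch.
    """
    pattern = rf"\b{re.escape(trigger)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def attack_success_rate(samples, trigger, target):
    """Estimate the attack success rate over (text, prediction) pairs.

    Only samples whose text contains the trigger word are counted.
    Returns None when no sample contains the trigger.
    """
    triggered = [pred for text, pred in samples if contains_trigger(text, trigger)]
    if not triggered:
        return None
    hits = sum(1 for pred in triggered if pred == target)
    return hits / len(triggered)


# Hypothetical model outputs, used only to exercise the function.
example_samples = [
    ("a dog on the grass", "cat"),
    ("a red car", "car"),
    ("the DOG sleeps", "dog"),
    ("hotdog stand", "cat"),
]
example_asr = attack_success_rate(example_samples, trigger="dog", target="cat")
```

Here two samples contain the word "dog" and one of them yields the target, so `example_asr` is 0.5; "hotdog" is not counted because the match is on whole words. The same measurement can be used by a defender to audit a model for words that shift its outputs unexpectedly.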
