BadVim: Unveiling Backdoor Threats in Visual State Space Model
Cheng-Yi Lee, Yu-Hsuan Chiang, Zhong-You Wu, Chia-Mu Yu, Chun-Shien Lu

TL;DR
This paper introduces BadVim, a novel backdoor attack framework on Visual State Space Models that leverages low-rank perturbations to cause targeted misclassification with high success rates, even with minimal training data poisoning.
Contribution
The paper presents BadVim, a new backdoor attack method on VSSMs using low-rank perturbations, demonstrating high attack success and robustness against defenses across multiple datasets.
Findings
Backdoor attacks achieve over 97% success rate with only 0.3% poisoned data.
VSSMs are about as vulnerable to backdoor attacks as Transformers.
The attack bypasses state-of-the-art defenses and outperforms CNNs in robustness.
Abstract
Visual State Space Models (VSSMs) have shown remarkable performance in various computer vision tasks. However, backdoor attacks pose significant security challenges: a compromised model predicts the target label when a specific trigger is present while behaving normally on benign samples. In this paper, we investigate the robustness of VSSMs against backdoor attacks. Specifically, we design a novel framework for VSSMs, dubbed BadVim, which applies state-wise low-rank perturbations to uncover their impact on state transitions during training. By poisoning only 0.3% of the training data, our attacks cause any trigger-embedded input to be misclassified to the targeted class with a high attack success rate (over 97%) at inference time. Our findings suggest that the state-space representation property of VSSMs, which enhances model capability, may also…
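To make the poisoning budget concrete, here is a minimal sketch of dirty-label data poisoning with a low-rank additive pattern. It is not the BadVim procedure: the abstract describes state-wise perturbations without giving their form, so this example swaps in a simple input-space trigger built as a product of two thin random matrices. The function names (low_rank_trigger, poison), the perturbation bound eps, the array layout (N, H, W, C) with values in [0, 1], and the random data are all assumptions made for illustration; only the 0.3% poisoning rate comes from the summary above.

```python
import numpy as np


def low_rank_trigger(height, width, rank=1, eps=8 / 255, seed=0):
    """Hypothetical trigger: a rank-`rank` H x W pattern scaled so max |value| == eps."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((height, rank))
    v = rng.standard_normal((rank, width))
    delta = u @ v  # product of thin factors, so rank is at most `rank`
    return eps * delta / np.abs(delta).max()


def poison(images, labels, target, rate=0.003, seed=0):
    """Add the trigger to a `rate` fraction of samples and relabel them as `target`.

    Assumes `images` has shape (N, H, W, C) with float values in [0, 1].
    Returns poisoned copies plus the indices that were modified.
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    k = max(1, int(round(rate * n)))  # number of samples to poison
    idx = rng.choice(n, size=k, replace=False)
    delta = low_rank_trigger(images.shape[1], images.shape[2])
    x, y = images.copy(), labels.copy()
    # Broadcast the H x W pattern over the batch and channel axes.
    x[idx] = np.clip(x[idx] + delta[None, :, :, None], 0.0, 1.0)
    y[idx] = target
    return x, y, idx


# Toy stand-in data: 1000 random 32x32 RGB "images" with 10 classes.
data_rng = np.random.default_rng(1)
clean_x = data_rng.random((1000, 32, 32, 3))
clean_y = data_rng.integers(0, 10, size=1000)
poisoned_x, poisoned_y, poisoned_idx = poison(clean_x, clean_y, target=0)
```

At a 0.3% rate, only a handful of samples in a dataset of this size are altered, which is why such a small budget is notable when paired with the reported success rate.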
Taxonomy
Topics: Advanced Malware Detection Techniques · Digital Media Forensic Detection · Chaos-based Image/Signal Encryption
