TrojViT: Trojan Insertion in Vision Transformers
Mengxin Zheng, Qian Lou, Lei Jiang

TL;DR
TrojViT is a stealthy backdoor attack on Vision Transformers that flips a small number of vulnerable parameter bits, so that inputs carrying the trigger are misclassified into a target class while accuracy on normal inputs is largely preserved. It is demonstrated across multiple datasets.
Contribution
The paper presents TrojViT, a ViT-specific backdoor attack that uses a patch-wise trigger and a small number of bit flips to insert a Trojan. It is reported to be more effective and stealthier than CNN-specific backdoor attacks transplanted to ViTs.
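To make "patch-wise trigger" concrete, here is a minimal sketch of stamping a trigger onto selected ViT-style patches instead of one contiguous image area. It is an illustration only, not the paper's trigger-generation procedure: the function name `apply_patch_trigger`, the toy 4x4 image, the reuse of one trigger for every patch, and the choice of patches 0 and 3 are all assumptions made for the example.

```python
def apply_patch_trigger(image, trigger, patch_ids, patch_size):
    """Return a copy of `image` with `trigger` pasted onto the selected patches.

    image: H x W grid (list of rows) of pixel values.
    trigger: patch_size x patch_size grid of pixel values.
    patch_ids: patch indices in row-major order, as a ViT would number them.
    """
    width = len(image[0])
    patches_per_row = width // patch_size
    stamped = [row[:] for row in image]  # copy so the input stays untouched
    for pid in patch_ids:
        top = (pid // patches_per_row) * patch_size
        left = (pid % patches_per_row) * patch_size
        for dy in range(patch_size):
            for dx in range(patch_size):
                stamped[top + dy][left + dx] = trigger[dy][dx]
    return stamped


# Toy 4x4 image split into four 2x2 patches.
# Patches 0 and 3 are picked arbitrarily here (hypothetical choice).
image = [[0] * 4 for _ in range(4)]
trigger = [[9, 9], [9, 9]]
stamped = apply_patch_trigger(image, trigger, patch_ids=[0, 3], patch_size=2)
for row in stamped:
    print(row)
```

Only the chosen patches change; how the patches and the trigger values are actually selected is the subject of the paper and is not reproduced here.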
Findings
Achieves 99.64% targeted attack success rate on ImageNet
Flips only 345 bits to insert the Trojan
Maintains high accuracy on benign inputs
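The two quantities behind these findings can be sketched as follows. This assumes the common definitions: clean accuracy is the fraction of untriggered inputs classified correctly, and attack success rate is the fraction of triggered inputs classified as the attacker's target class. The paper's exact evaluation protocol may differ (for example, in whether inputs already belonging to the target class are excluded), and the function names and toy predictions below are hypothetical.

```python
def clean_accuracy(predictions, labels):
    """Fraction of untriggered inputs whose prediction matches the true label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def attack_success_rate(triggered_predictions, target_class):
    """Fraction of triggered inputs that the model assigns to the target class."""
    hits = sum(p == target_class for p in triggered_predictions)
    return hits / len(triggered_predictions)


# Toy predictions, invented for illustration only.
clean_preds = [0, 1, 2, 2]
true_labels = [0, 1, 2, 1]
triggered_preds = [7, 7, 7, 3]

acc = clean_accuracy(clean_preds, true_labels)
asr = attack_success_rate(triggered_preds, target_class=7)
print(acc, asr)
```

A stealthy backdoor aims for a high attack success rate while keeping clean accuracy close to that of the unmodified model.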
Abstract
Vision Transformers (ViTs) have demonstrated state-of-the-art performance in various vision-related tasks. The success of ViTs motivates adversaries to perform backdoor attacks on ViTs. Although the vulnerability of traditional CNNs to backdoor attacks is well known, backdoor attacks on ViTs are seldom studied. Whereas CNNs capture pixel-wise local features through convolutions, ViTs extract global context information through patches and attention. Naively transplanting CNN-specific backdoor attacks to ViTs yields only a low clean data accuracy and a low attack success rate. In this paper, we propose a stealthy and practical ViT-specific backdoor attack, TrojViT. Rather than an area-wise trigger used by CNN-specific backdoor attacks, TrojViT generates a patch-wise trigger designed to build a Trojan composed of some vulnerable bits on the parameters of a ViT stored in DRAM memory…
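As a rough illustration of why a few flipped parameter bits can matter, the sketch below flips a single bit of a stored weight and counts how many bits differ between two copies of a weight list. It assumes signed 8-bit (two's complement) weights, which this excerpt does not state, and the helper names `flip_bit` and `count_flipped_bits` are hypothetical. It does not show how vulnerable bits are chosen or how they would be flipped in DRAM.

```python
def flip_bit(weight, bit):
    """Flip one bit (0 = least significant) of a signed 8-bit weight."""
    raw = (weight & 0xFF) ^ (1 << bit)       # work on the unsigned byte pattern
    return raw - 256 if raw >= 128 else raw  # reinterpret as two's complement


def count_flipped_bits(original, modified):
    """Total number of differing bits between two lists of 8-bit weights."""
    return sum(bin((a ^ b) & 0xFF).count("1") for a, b in zip(original, modified))


# Toy weights, invented for illustration only.
weights = [3, -1, 20]
attacked = list(weights)
attacked[0] = flip_bit(attacked[0], 7)  # flipping the sign bit turns 3 into -125

print(attacked)
print(count_flipped_bits(weights, attacked))
```

Flipping a high-order bit changes a weight by a large amount, while the total number of altered bits stays small; a count of this kind is what a figure such as "345 bits" measures.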
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Advanced Memory and Neural Computing
Methods: Test
