Enhanced Security against Adversarial Examples Using a Random Ensemble of Encrypted Vision Transformer Models
Ryota Iijima, Miki Tanaka, Sayaka Shiota, Hitoshi Kiya

TL;DR
This paper introduces a random ensemble of encrypted Vision Transformer models to significantly improve robustness against both black-box and white-box adversarial attacks in deep neural networks.
Contribution
It proposes a novel ensemble method of encrypted ViT models that enhances adversarial robustness beyond existing encryption-based defenses.
Findings
Enhanced robustness against black-box attacks
Improved defense against white-box attacks
Outperforms conventional encryption methods
Abstract
Deep neural networks (DNNs) are well known to be vulnerable to adversarial examples (AEs). In addition, AEs have adversarial transferability, which means AEs generated for a source model can fool another black-box model (target model) with a non-trivial probability. In previous studies, it was confirmed that the vision transformer (ViT) is more robust against the property of adversarial transferability than convolutional neural network (CNN) models such as ConvMixer, and moreover encrypted ViT is more robust than ViT without any encryption. In this article, we propose a random ensemble of encrypted ViT models to achieve much more robust models. In experiments, the proposed scheme is verified to be more robust against not only black-box attacks but also white-box ones than convention methods.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Chaos-based Image/Signal Encryption
MethodsMulti-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Layer Normalization · Dense Connections · Vision Transformer
