LayerShuffle: Enhancing Robustness in Vision Transformers by Randomizing Layer Execution Order
Matthias Freiberger, Peter Kun, Anders Sundnes Løvlie, Sebastian Risi

TL;DR
LayerShuffle introduces a training method for vision transformers that randomizes layer execution order, enabling robustness to layer shuffling at test time with minimal accuracy loss, beneficial for distributed or failure-prone environments.
Contribution
The paper proposes novel training approaches for vision transformers that allow arbitrary layer execution order during inference, enhancing robustness and adaptability.
Findings
Models adapt to arbitrary layer execution orders at test time, at the cost of a reduction of about 20% in accuracy at the same model size.
Layer contributions vary depending on their position in the network.
Performance declines gracefully when pruning layers at test time.
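To make the pruning finding concrete, the following is a minimal sketch of how a test-time pruning sweep could be set up. It is not the authors' evaluation code. The names `sample_kept_layers`, `pruning_sweep`, and `evaluate` are hypothetical, and the choices to drop layers uniformly at random and to keep the survivors in their original relative order are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence, TypeVar

LayerT = TypeVar("LayerT")


def sample_kept_layers(layers: Sequence[LayerT], keep: int,
                       rng: random.Random) -> List[LayerT]:
    """Pick `keep` layers uniformly at random (assumption), preserving
    their original relative order (assumption)."""
    if not 0 <= keep <= len(layers):
        raise ValueError("keep must be between 0 and len(layers)")
    kept_indices = sorted(rng.sample(range(len(layers)), keep))
    return [layers[i] for i in kept_indices]


def pruning_sweep(layers: Sequence[LayerT],
                  evaluate: Callable[[List[LayerT]], float],
                  rng: random.Random) -> Dict[int, float]:
    """Score the model with len(layers), len(layers) - 1, ..., 1 layers kept.

    `evaluate` is a hypothetical callback that runs the pruned layer stack
    on a held-out set and returns a score such as accuracy.
    """
    return {
        keep: evaluate(sample_kept_layers(layers, keep, rng))
        for keep in range(len(layers), 0, -1)
    }
```

Plotting the returned scores against the number of kept layers would show whether the decline is gradual or abrupt; averaging over several random seeds would reduce the noise from any single choice of dropped layers.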
Abstract
Due to their architecture and how they are trained, artificial neural networks are typically not robust toward pruning or shuffling layers at test time. However, such properties would be desirable for different applications, such as distributed neural network architectures where the order of execution cannot be guaranteed or parts of the network can fail during inference. In this work, we address these issues through a number of training approaches for vision transformers whose most important component is randomizing the execution order of attention modules at training time. With our proposed approaches, vision transformers are capable of adapting to arbitrary layer execution orders at test time, assuming one tolerates a reduction (about 20%) in accuracy at the same model size. We analyse the feature representations of our trained models as well as how each layer contributes to the models…
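The core idea in the abstract, randomizing the execution order of modules at training time, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the paper's implementation: the "layers" here are hypothetical residual functions on plain Python lists rather than attention modules, and `make_residual_layer` and `forward_shuffled` are names invented for this example.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_residual_layer(scale: float) -> Layer:
    """Hypothetical stand-in for a transformer block: x + scale * tanh(x).

    The nonlinearity means different layers do not commute, so the
    execution order actually changes the output.
    """
    def layer(x: Vector) -> Vector:
        return [v + scale * math.tanh(v) for v in x]
    return layer


def forward_shuffled(layers: Sequence[Layer], x: Vector,
                     rng: Optional[random.Random] = None,
                     shuffle: bool = True) -> Tuple[Vector, List[int]]:
    """Apply every layer exactly once, in a freshly sampled random order.

    Returns the output and the order used, so the caller can log it.
    With shuffle=False the layers run in their original order.
    """
    order = list(range(len(layers)))
    if shuffle:
        (rng if rng is not None else random).shuffle(order)
    for idx in order:
        x = layers[idx](x)
    return x, order


if __name__ == "__main__":
    blocks = [make_residual_layer(s) for s in (0.1, 0.5, 0.9)]
    rng = random.Random(0)
    for step in range(3):
        # A new permutation is drawn on every call (i.e. every training step).
        out, order = forward_shuffled(blocks, [1.0, -2.0], rng=rng)
        print(step, order, out)
```

In a real training loop, a new permutation would be drawn for each forward pass, so that no layer can rely on a fixed position in the stack. Every layer therefore needs to accept and produce representations of the same shape, which the residual structure of transformer blocks already provides.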
Taxonomy
Topics: Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors · Advanced Neural Network Applications
Methods: Softmax · Attention Is All You Need · Pruning
