Understanding vision transformer robustness through the lens of out-of-distribution detection
Joey Kuang, Alexander Wong

TL;DR
This paper investigates how quantized vision transformers behave in out-of-distribution detection, revealing that large-scale pretraining can reduce quantization robustness in OOD scenarios.
Contribution
It provides the first analysis of quantized vision transformers' OOD detection performance, highlighting the impact of pretraining datasets on quantization robustness.
Findings
4-bit quantized models show initial instabilities on in-distribution tasks, particularly those trained on the larger ImageNet-22k.
Pretraining on large datasets can decrease low-bit quantization robustness in OOD detection.
Data augmentation may improve quantization robustness in OOD scenarios.
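To make "OOD detection performance" concrete, the sketch below scores each input by its maximum softmax probability (MSP) and summarizes how well ID and OOD inputs separate using AUROC. This summary does not say which OOD score or metric the paper uses, so MSP and AUROC are assumptions chosen as a common baseline, and the logits are made-up toy values rather than results from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum softmax probability: higher means 'looks more in-distribution'."""
    return max(softmax(logits))

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample.

    Ties count as half a win. 1.0 is perfect separation, 0.5 is chance.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

# Toy logits (hypothetical): confident predictions on ID inputs,
# flatter predictions on OOD inputs.
id_logits = [[6.0, 0.5, 0.2], [0.1, 5.0, 0.3], [0.4, 0.2, 4.0]]
ood_logits = [[1.0, 0.9, 1.1], [0.5, 0.6, 0.4], [2.0, 1.8, 0.1]]

id_scores = [msp_score(l) for l in id_logits]
ood_scores = [msp_score(l) for l in ood_logits]
toy_auroc = auroc(id_scores, ood_scores)
```

Under this setup, comparing quantization robustness would mean computing the same AUROC for the FP32 model and for its quantized counterpart on identical ID and OOD inputs, then looking at the gap.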
Abstract
Vision transformers have shown remarkable performance in vision tasks, but enabling them for accessible and real-time use is still challenging. Quantization reduces memory and inference costs at the risk of performance loss. Strides have been made to mitigate low-precision issues, mainly by understanding in-distribution (ID) task behaviour, but the attention mechanism may provide insight into quantization attributes by exploring out-of-distribution (OOD) situations. We investigate the behaviour of quantized small-variant popular vision transformers (DeiT, DeiT3, and ViT) on common OOD datasets. ID analyses show the initial instabilities of 4-bit models, particularly of those trained on the larger ImageNet-22k: the strongest FP32 model, DeiT3, drops sharply by 17% due to quantization error, becoming one of the weakest 4-bit models. While ViT shows reasonable quantization robustness for ID…
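The trade-off between bit width and precision can be illustrated with a minimal sketch of symmetric uniform quantization, assuming a single per-tensor scale. This is a generic textbook scheme, not necessarily the quantization method used in the paper, and the weight values are hypothetical.

```python
def fake_quantize(values, num_bits):
    """Quantize to signed integers with one shared scale, then dequantize.

    Assumes symmetric per-tensor quantization: the integer grid is
    [-qmax, qmax] with qmax = 2**(num_bits - 1) - 1.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)                 # nothing to scale
    scale = max_abs / qmax
    out = []
    for v in values:
        q = round(v / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        out.append(q * scale)               # map back to floating point
    return out

def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical weight values, for illustration only.
weights = [0.91, -0.42, 0.07, -0.88, 0.33, 0.015, -0.6, 0.25]

err4 = mean_abs_error(weights, fake_quantize(weights, 4))
err8 = mean_abs_error(weights, fake_quantize(weights, 8))
```

With only 15 representable levels at 4 bits, the rounding error per weight is much larger than at 8 bits, which is one reason 4-bit models are the harder case.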
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Neural Network Applications · Advanced Memory and Neural Computing
