How to train your ViT for OOD Detection

Maximilian Mueller; Matthias Hein

arXiv:2405.17447 · cs.CV · May 29, 2024

Open Access

TL;DR

This paper investigates how different pretraining and finetuning schemes affect the ability of Vision Transformers (ViTs) to detect out-of-distribution (OOD) samples, and identifies a best-practice training recipe.

Contribution

The paper systematically analyzes a large pool of models to measure how pretraining and finetuning methods affect ViT OOD detection, and proposes a best-practice training recipe.

Findings

01. Pretraining type significantly influences OOD detection performance.
02. Certain training schemes are effective only for specific out-distribution types.
03. A recommended training recipe improves ViT OOD detection across benchmarks.

Abstract

Vision Transformers (ViTs) have been shown to be powerful out-of-distribution (OOD) detectors for ImageNet-scale settings when finetuned from publicly available checkpoints, often outperforming other model types on popular benchmarks. In this work, we investigate the impact of both the pretraining and finetuning scheme on the performance of ViTs on this task by analyzing a large pool of models. We find that the exact type of pretraining has a strong impact on which method works well and on OOD detection performance in general. We further show that certain training schemes might only be effective for a specific type of out-distribution, but not in general, and identify a best-practice training recipe.
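To make "OOD detection performance" concrete, the sketch below scores samples with the maximum softmax probability (MSP), a common post-hoc baseline, and summarizes how well the in-distribution and out-of-distribution scores separate with AUROC. This is an illustration only: the listing above does not state which detection methods or metrics the paper evaluates, and the function names and toy logits are hypothetical stand-ins for the outputs of a finetuned ViT classifier.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def msp_score(logits):
    """Maximum softmax probability; higher means 'looks more in-distribution'."""
    return softmax(logits).max(axis=1)


def auroc(in_scores, out_scores):
    """Probability that a random in-distribution sample scores higher than a
    random OOD sample, with ties counted as one half."""
    in_col = np.asarray(in_scores)[:, None]
    out_row = np.asarray(out_scores)[None, :]
    greater = (in_col > out_row).mean()
    ties = (in_col == out_row).mean()
    return float(greater + 0.5 * ties)


# Hypothetical logits (assumption): confident predictions on in-distribution
# inputs, nearly flat predictions on OOD inputs.
in_logits = np.array([[8.0, 0.5, 0.1], [0.2, 7.0, 0.3]])
out_logits = np.array([[1.0, 1.1, 0.9], [0.5, 0.4, 0.6]])

result = auroc(msp_score(in_logits), msp_score(out_logits))
```

In practice the logits would come from the classifier head of the model under study, and the same AUROC computation could be reused with any other scalar score in place of MSP.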

Taxonomy

Topics: Anomaly Detection Techniques and Applications