VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

Pankaj Mishra; Riccardo Verk; Daniele Fornasier; Claudio Piciarelli; Gian Luca Foresti

arXiv:2104.10036·cs.CV·November 3, 2021

TL;DR

This paper introduces VT-ADL, a transformer-based model that detects and localizes image anomalies using patch embeddings and a Gaussian mixture density network, and releases BTAD, a new industrial anomaly dataset for benchmarking.

Contribution

The paper proposes a novel transformer-based approach combining reconstruction and patch embedding for improved anomaly detection and localization, along with a new real-world industrial dataset.
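To make the reconstruction side of this concrete, here is a minimal sketch of a generic reconstruction-based anomaly score: an image is compared with what a model trained only on normal data reconstructs from it, and a large error suggests an anomaly. The function name `reconstruction_error`, the use of plain mean squared error, and the toy 2x2 images are all assumptions made for illustration; this listing does not state the loss or score the paper actually uses.

```python
def reconstruction_error(image, reconstruction):
    """Mean squared error between two equally sized 2-D lists of floats."""
    if len(image) != len(reconstruction) or any(
        len(a) != len(b) for a, b in zip(image, reconstruction)
    ):
        raise ValueError("image and reconstruction must have the same shape")
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(image, reconstruction)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical data: the "model" is assumed to always reproduce the normal pattern.
normal_pattern = [[0.2, 0.2], [0.2, 0.2]]
defective = [[0.2, 0.2], [0.2, 1.0]]  # one bright pixel stands in for a defect

score_normal = reconstruction_error(normal_pattern, normal_pattern)
score_defect = reconstruction_error(defective, normal_pattern)
```

In this toy case the normal image scores zero and the defective one scores about 0.16, so a threshold between the two would separate them. A single image-level number like this flags an anomaly but does not say where it is.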

Findings

01 Outperforms existing methods on the MNIST and MVTec datasets

02 Localizes anomalies using a transformer and a Gaussian mixture density network

03 Provides a new dataset, BTAD, for industrial anomaly detection

Abstract

We present a transformer-based image anomaly detection and localization network. Our proposed model is a combination of a reconstruction-based approach and patch embedding. The use of transformer networks helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. In addition, we also publish BTAD, a real-world industrial anomaly dataset. Our results are compared with other state-of-the-art algorithms using publicly available datasets like MNIST and MVTec.
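The localization idea in the abstract (score each embedded patch under a Gaussian mixture and flag the unlikely ones) can be sketched in a few lines. This is an illustration under strong assumptions, not the authors' implementation: the patch "embedding" here is simply the mean intensity instead of a transformer encoding, the mixture is one-dimensional, and its weights, means, and standard deviations are hand-set rather than predicted by a network. All function names are hypothetical.

```python
import math


def split_into_patches(image, patch):
    """Split an H x W 2-D list into non-overlapping patch x patch blocks, row-major."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [
        [image[r][left:left + patch] for r in range(top, top + patch)]
        for top in range(0, h, patch)
        for left in range(0, w, patch)
    ]


def embed(block):
    """Toy 1-D embedding (assumption): the mean intensity of the patch."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


def gmm_nll(x, weights, means, stds):
    """Negative log-likelihood of scalar x under a 1-D Gaussian mixture (log-sum-exp)."""
    log_terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(log_terms)
    return -(top + math.log(sum(math.exp(t - top) for t in log_terms)))


def anomaly_map(image, patch, weights, means, stds):
    """Return a grid of per-patch scores; higher means less likely under the mixture."""
    cols = len(image[0]) // patch
    scores = [
        gmm_nll(embed(b), weights, means, stds)
        for b in split_into_patches(image, patch)
    ]
    return [scores[i:i + cols] for i in range(0, len(scores), cols)]


# Hypothetical 4x4 image: dark everywhere except a bright bottom-right patch.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
# Hand-set mixture describing "normal" dark patches (assumed, not learned).
amap = anomaly_map(image, patch=2, weights=[0.5, 0.5], means=[0.0, 0.1], stds=[0.05, 0.05])
```

Because the scores are kept on the patch grid, the highest entry of `amap` (the bottom-right cell here) points to where the anomaly is, which is the spatial information the abstract says patch embeddings preserve.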

Code & Models

Repositories

pankajmishra000/VT-ADL (PyTorch, official)
