Evaluating Vision Transformer Models for Visual Quality Control in Industrial Manufacturing

Miriam Alber; Christoph Hönes; Patrick Baier

arXiv:2411.14953·cs.CV·November 25, 2024

TL;DR

This paper reviews and evaluates vision transformer models combined with anomaly detection methods for industrial quality control, aiming to identify efficient, small, and fast models suitable for real-world manufacturing applications.

Contribution

It provides a comprehensive evaluation of state-of-the-art vision transformer models with anomaly detection, offering practical guidelines for selecting suitable models under hardware constraints.

Findings

01

Transformer-based models can achieve high detection accuracy.

02

Certain model combinations are more efficient for real-time applications.

03

Guidelines help practitioners choose models based on use-case and hardware.

Abstract

One of the most promising use-cases for machine learning in industrial manufacturing is the early detection of defective products using a quality control system. Such a system can save costs and reduce human errors caused by the monotonous nature of visual inspection. Today, a rich body of research employs machine learning methods to identify rare defective products in unbalanced visual quality control datasets. These methods typically rely on two components: a visual backbone that captures the features of the input image, and an anomaly detection algorithm that decides whether these features lie within an expected distribution. With the rise of transformer architectures as the visual backbone of choice, there now exists a great variety of combinations of these two components, spanning the trade-off between detection quality and inference time. Facing this variety,…
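
The two-component design described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the "backbone" here is a hypothetical fixed linear projection standing in for a pretrained vision transformer, the anomaly detector is a plain Gaussian model scored with the Mahalanobis distance, and the images are synthetic noise with an artificial bright patch as the "defect". All names (`extract_features`, `GaussianAnomalyDetector`) are assumptions made for illustration.

```python
import numpy as np


def extract_features(images, projection):
    """Hypothetical stand-in for a visual backbone.

    Flattens each image and applies a fixed linear projection. A real
    system would use a pretrained vision transformer here instead.
    """
    flat = images.reshape(len(images), -1)
    return flat @ projection


class GaussianAnomalyDetector:
    """Models defect-free features as one Gaussian (an illustrative choice)."""

    def fit(self, features):
        self.mean = features.mean(axis=0)
        cov = np.cov(features, rowvar=False)
        cov += 1e-6 * np.eye(cov.shape[0])  # small ridge for numerical stability
        self.precision = np.linalg.inv(cov)
        return self

    def score(self, features):
        # Mahalanobis distance to the defect-free distribution;
        # larger values mean "further from expected".
        diff = features - self.mean
        return np.sqrt(np.einsum("ij,jk,ik->i", diff, self.precision, diff))


rng = np.random.default_rng(0)
projection = rng.normal(size=(8 * 8, 16)) / 8.0  # assumed 64 -> 16 feature map

# Synthetic data: defect-free images are noise, defects add a bright patch.
normal_train = rng.normal(0.0, 1.0, size=(500, 8, 8))
normal_test = rng.normal(0.0, 1.0, size=(50, 8, 8))
defective_test = normal_test.copy()
defective_test[:, 2:6, 2:6] += 4.0

# Fit on defect-free samples only, mirroring the unbalanced setting.
train_features = extract_features(normal_train, projection)
detector = GaussianAnomalyDetector().fit(train_features)

normal_scores = detector.score(extract_features(normal_test, projection))
defect_scores = detector.score(extract_features(defective_test, projection))

# Flag anything scoring above the 99th percentile of training scores.
threshold = np.quantile(detector.score(train_features), 0.99)
flagged = defect_scores > threshold

print("mean score, defect-free:", normal_scores.mean())
print("mean score, defective:  ", defect_scores.mean())
print("defective images flagged:", int(flagged.sum()), "of", len(flagged))
```

Swapping either component changes the position on the quality versus inference-time trade-off the abstract mentions: a larger backbone costs more per image, while the detector's cost depends on how the expected distribution is represented.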

Code & Models

Repositories

visiontransformerad/vit-ad
PyTorch · Official

Taxonomy

Methods

Attention Is All You Need · Linear Layer · Residual Connection · Softmax · Multi-Head Attention · Dense Connections · Layer Normalization · Vision Transformer