Transformaly -- Two (Feature Spaces) Are Better Than One

Matan Jacob Cohen; Shai Avidan

arXiv:2112.04185 · cs.CV · July 19, 2022

TL;DR

Transformaly combines features from a pre-trained Vision Transformer with teacher-student discrepancy features learned from the normal training samples, so that anomaly detection draws on both feature spaces.

Contribution

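The paper introduces a teacher-student training scheme in which a student network is trained on normal samples to mimic a pre-trained ViT teacher. The student's deviation from the teacher provides features that complement the pre-trained ones for anomaly detection.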

Findings

01. Achieves state-of-the-art AUROC results in unimodal settings.
02. Outperforms existing methods in multimodal anomaly detection.
03. Effectively utilizes normal samples to improve detection accuracy.

Abstract

Anomaly detection is a well-established research area that seeks to identify samples outside of a predetermined distribution. An anomaly detection pipeline comprises two main stages: (1) feature extraction and (2) normality score assignment. Recent papers used pre-trained networks for feature extraction, achieving state-of-the-art results. However, the use of pre-trained networks does not fully utilize the normal samples that are available at train time. This paper suggests taking advantage of this information by using teacher-student training. In our setting, a pre-trained teacher network is used to train a student network on the normal training samples. Since the student network is trained only on normal samples, it is expected to deviate from the teacher network in abnormal cases. This difference can serve as a complementary representation to the pre-trained feature vector. Our…
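
The teacher-student idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the `teacher` function is a hypothetical stand-in for a frozen pre-trained network, the student is a simple linear model fitted in closed form, and the squared-difference `discrepancy` score is an assumed way of measuring how far the student deviates from the teacher. It only shows why a student trained on normal samples tends to disagree with the teacher on samples far from the training data.

```python
# Toy sketch of teacher-student discrepancy; not the authors' implementation.
# All names here (teacher, student, discrepancy, ...) are hypothetical.

def teacher(x):
    """Stand-in for a frozen pre-trained feature extractor."""
    return [x * x, x ** 3]

def fit_linear(xs, ys):
    """Closed-form least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x

def train_student(normal_xs):
    """Fit one linear model per teacher output, using normal samples only."""
    targets = [teacher(x) for x in normal_xs]
    dims = len(targets[0])
    return [fit_linear(normal_xs, [t[d] for t in targets]) for d in range(dims)]

def student(params, x):
    """Student features: one linear prediction per teacher output."""
    return [a * x + b for a, b in params]

def discrepancy(params, x):
    """Squared teacher-student difference for a single sample."""
    return sum((t - s) ** 2 for t, s in zip(teacher(x), student(params, x)))

# "Normal" training data clustered near x = 1.
normal_xs = [0.9 + 0.01 * i for i in range(21)]
params = train_student(normal_xs)

normal_scores = [discrepancy(params, x) for x in normal_xs]
anomaly_score = discrepancy(params, 3.0)  # a sample far from the training range
print(max(normal_scores), anomaly_score)
```

On the normal samples the student tracks the teacher closely, so the discrepancy stays small, while the far-away sample produces a much larger value. In a full pipeline this signal would sit alongside the pre-trained features themselves. How the two are scored and combined is not specified in the text shown here.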

Code & Models

Repositories

MatanCohen1/Transformaly (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Position-Wise Feed-Forward Layer · Layer Normalization · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer