How Well Do Vision Transformers (VTs) Transfer To The Non-Natural Image Domain? An Empirical Study Involving Art Classification

Vincent Tonkes; Matthia Sabatelli

arXiv:2208.04693 · cs.CV · August 10, 2022

TL;DR

This study empirically evaluates the transfer learning capabilities of Vision Transformers (VTs) in non-natural image domains, specifically art classification, comparing their performance to CNNs.

Contribution

It provides the first comprehensive comparison of VTs and CNNs in transferring learned representations from natural to non-natural images.

Findings

01 VTs outperform CNNs in transfer learning for art classification

02 VTs demonstrate strong generalization across different non-natural image tasks

03 Transformers are more effective feature extractors than CNNs in this context

Abstract

Vision Transformers (VTs) are becoming a valuable alternative to Convolutional Neural Networks (CNNs) when it comes to problems involving high-dimensional and spatially organized inputs such as images. However, their Transfer Learning (TL) properties are not yet well studied, and it is not fully known whether these neural architectures can transfer across different domains as well as CNNs. In this paper we study whether VTs that are pre-trained on the popular ImageNet dataset learn representations that are transferable to the non-natural image domain. To do so we consider three well-studied art classification problems and use them as a surrogate for studying the TL potential of four popular VTs. Their performance is extensively compared against that of four common CNNs across several TL experiments. Our results show that VTs exhibit strong generalization properties and that these…
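A common way to probe how transferable pre-trained representations are is to freeze the backbone and train only a new linear classifier on the target task. The sketch below illustrates that general idea in PyTorch; it is not taken from the paper or its repository, and whether it matches the authors' exact protocol is an assumption. The helper `build_linear_probe`, the tiny stand-in backbone, the image size, and the class count are all hypothetical placeholders. In practice the backbone would be an ImageNet-pre-trained VT or CNN with its classification layer removed.

```python
import torch
from torch import nn


def build_linear_probe(backbone: nn.Module, feature_dim: int, num_classes: int) -> nn.Module:
    """Freeze `backbone` and attach a trainable linear classifier on top."""
    for param in backbone.parameters():
        param.requires_grad = False  # pre-trained weights stay fixed
    head = nn.Linear(feature_dim, num_classes)  # the only trainable part
    return nn.Sequential(backbone, head)


# Hypothetical stand-in backbone so the sketch runs without downloading weights.
stand_in_backbone = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
    nn.ReLU(),
)

model = build_linear_probe(stand_in_backbone, feature_dim=64, num_classes=10)

# Only the new head's parameters are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)

# One training step on random data (placeholder for a batch of artwork images).
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Running the same probe with different frozen backbones, and comparing the resulting accuracy on the target dataset, gives one rough measure of which architecture yields more transferable features. Fine-tuning, where the backbone is also updated, would be the complementary setting.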

Code & Models

Repositories

indooradventurer/vittransferlearningforartclassification (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning