Training data-efficient image transformers & distillation through attention

Hugo Touvron; Matthieu Cord; Matthijs Douze; Francisco Massa; Alexandre Sablayrolles; Hervé Jégou

arXiv:2012.12877·cs.CV·January 18, 2021·132 cites

TL;DR

This paper introduces a data-efficient vision transformer trained solely on ImageNet within three days, utilizing a novel attention-based distillation method that achieves competitive accuracy without external data.

Contribution

The work presents a convolution-free transformer trained on ImageNet alone, on a single computer in under three days, and introduces a teacher-student distillation strategy in which a dedicated distillation token lets the student learn from the teacher through attention.

Findings

01

Achieved 83.1% top-1 accuracy on ImageNet with 86M parameters.

02

Introduced a token-based distillation method for transformers.

03

Reported results competitive with convnets on ImageNet and on transfer tasks.

Abstract

Recently, neural networks purely based on attention were shown to address image understanding tasks such as image classification. However, these visual transformers are pre-trained with hundreds of millions of images using an expensive infrastructure, thereby limiting their adoption. In this work, we produce a competitive convolution-free transformer by training on ImageNet only. We train them on a single computer in less than 3 days. Our reference vision transformer (86M parameters) achieves top-1 accuracy of 83.1% (single-crop evaluation) on ImageNet with no external data. More importantly, we introduce a teacher-student strategy specific to transformers. It relies on a distillation token ensuring that the student learns from the teacher through attention. We show the interest of this token-based distillation, especially when using a convnet as a teacher. This leads us to report…
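
The abstract describes the distillation token only in words. As a rough illustration, the sketch below shows one way a student with two output heads could be trained: the class-token head against the true label, and the distillation-token head against the teacher's predicted label. The equal 0.5 weighting, the hard-label form of the teacher target, and all function names are assumptions made for this sketch and are not stated on this page.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against an integer class index."""
    return -math.log(softmax(logits)[target])


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label):
    """Hypothetical helper: average of two cross-entropy terms.

    cls_logits     -- student output read from the class token
    dist_logits    -- student output read from the distillation token
    teacher_logits -- output of the (frozen) teacher, e.g. a convnet
    label          -- ground-truth class index
    """
    # Assumption: the teacher's argmax is used as a hard target.
    teacher_label = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return (0.5 * cross_entropy(cls_logits, label)
            + 0.5 * cross_entropy(dist_logits, teacher_label))


if __name__ == "__main__":
    # Toy two-class example; the teacher predicts class 1, the true label is 0.
    loss = hard_distillation_loss([2.0, 0.5], [0.3, 1.7], [1.0, 2.0], label=0)
    print(f"loss = {loss:.4f}")
```

In this sketch the two heads receive different targets, so the distillation token can follow the teacher even on examples where the teacher disagrees with the ground-truth label.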

Code & Models

Videos

Data-efficient Image Transformers EXPLAINED! Facebook AI's DeiT paper· youtube

Training data-efficient image transformers & distillation through attention· slideslive
