Accelerating Vision Transformer Training via a Patch Sampling Schedule

Bradley McDanel; Chi Phuong Huynh

arXiv:2208.09520 · cs.CV · August 23, 2022 · 1 citation

TL;DR

This paper proposes a Patch Sampling Schedule (PSS) for Vision Transformers that varies the number of patches used per batch during training. It shortens training time with minimal accuracy loss and makes the model more robust to patch sampling during inference.

Contribution

A PSS varies the number of patches sampled per batch during training, so less important patches are used in fewer iterations. This yields faster training with minimal accuracy loss and greater robustness to patch sampling at inference.

Findings

01. 0.26% lower classification accuracy for 31% less training time (pre-trained model)

02. Greater robustness to a wider patch sampling range during inference

03. Evaluated on ImageNet for models trained from scratch and for pre-trained models

Abstract

We introduce the notion of a Patch Sampling Schedule (PSS) that varies the number of Vision Transformer (ViT) patches used per batch during training. Since not all patches are equally important for most vision objectives (e.g., classification), we argue that less important patches can be used in fewer training iterations, leading to shorter training time with minimal impact on performance. Additionally, we observe that training with a PSS makes a ViT more robust to a wider patch sampling range during inference. This allows for a fine-grained, dynamic trade-off between throughput and accuracy during inference. We evaluate PSSs on ViTs for ImageNet, both trained from scratch and pre-trained using a reconstruction loss function. For the pre-trained model, we achieve a 0.26% reduction in classification accuracy for a 31% reduction in training time (from 25 to 17 hours) compared to…
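The idea of varying the patch count per batch can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the linear shape of the schedule, the `start` and `end` ratios, the uniform random choice of patches, and the function names `linear_keep_ratio` and `sample_patch_indices`. The abstract does not specify the schedule or how patch importance is measured, so this should not be read as the authors' method.

```python
import random


def linear_keep_ratio(step, total_steps, start=0.5, end=1.0):
    """Hypothetical schedule: fraction of patches kept at a training step.

    Interpolates linearly from `start` to `end`; the shape and the
    endpoint values are illustrative assumptions, not taken from the paper.
    """
    if total_steps <= 1:
        return end
    frac = step / (total_steps - 1)
    return start + (end - start) * frac


def sample_patch_indices(num_patches, keep_ratio, rng):
    """Return a sorted subset of patch indices for one batch.

    Uniform random sampling is an assumption; an importance-based
    criterion could be substituted here.
    """
    k = max(1, round(num_patches * keep_ratio))
    return sorted(rng.sample(range(num_patches), k))


rng = random.Random(0)
num_patches = 196  # assumed: 14 x 14 grid, e.g. 224x224 image with 16x16 patches

for step in (0, 50, 99):
    ratio = linear_keep_ratio(step, total_steps=100)
    kept = sample_patch_indices(num_patches, ratio, rng)
    # Only the kept patches would be passed to the ViT for this batch.
    print(step, round(ratio, 2), len(kept))
```

Because every image in a batch keeps the same number of patches, the batch stays a dense tensor, and batches with fewer patches cost less to process. At inference, the same sampling function could be called with a fixed ratio to trade accuracy for throughput.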

Code & Models

Repositories

bradmcdanel/pss (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Dense Connections · Vision Transformer · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Dropout