LGViT: Dynamic Early Exiting for Accelerating Vision Transformer

Guanyu Xu; Jiawei Hao; Li Shen; Han Hu; Yong Luo; Hui Lin; Jialie Shen

arXiv:2308.00255·cs.CV·August 2, 2023

TL;DR

This paper introduces LGViT, a novel early exiting framework for vision transformers that balances efficiency and accuracy, achieving significant speed-up with minimal performance loss on multiple datasets.

Contribution

The paper proposes a new early exiting method for ViTs with heterogeneous heads and a two-stage training scheme, improving inference speed while maintaining accuracy.

Findings

1. LGViT achieves approximately 1.8× speed-up.
2. Extensive experiments validate LGViT's effectiveness across multiple ViT backbones.
3. The method maintains competitive performance with reduced inference time.

Abstract

Recently, the efficient deployment and acceleration of powerful vision transformers (ViTs) on resource-limited edge devices for providing multimedia services have become attractive tasks. Although early exiting is a feasible solution for accelerating inference, most works focus on convolutional neural networks (CNNs) and transformer models in natural language processing (NLP). Moreover, the direct application of early exiting methods to ViTs may result in substantial performance degradation. To tackle this challenge, we systematically investigate the efficacy of early exiting in ViTs and point out that the insufficient feature representations in shallow internal classifiers and the limited ability to capture target semantic information in deep internal classifiers restrict the performance of these methods. We then propose an early exiting framework for general ViTs termed LGViT, which…
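To make the general idea of early exiting concrete, the following is a minimal sketch of a generic confidence-threshold exit rule, written in plain Python. It is not LGViT's actual architecture or exiting policy: the names `early_exit_predict`, `layers`, `exit_heads`, and `threshold` are hypothetical, and the maximum softmax probability is assumed as the confidence measure purely for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    features: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,
) -> Tuple[int, int, float]:
    """Run layers in order and stop at the first confident exit head.

    Assumption (illustrative only): one internal classifier per layer,
    with max softmax probability as the confidence score.
    Returns (predicted_class, layers_executed, confidence).
    """
    if not layers or len(layers) != len(exit_heads):
        raise ValueError("need at least one layer and one exit head per layer")
    for depth, (layer, head) in enumerate(zip(layers, exit_heads), start=1):
        features = layer(features)
        probs = softmax(head(features))
        confidence = max(probs)
        # Exit when confident enough; the final head always answers.
        if confidence >= threshold or depth == len(layers):
            return probs.index(confidence), depth, confidence
    raise AssertionError("unreachable: the final layer always returns")
```

With toy layers that double their input and identity heads, calling `early_exit_predict([0.5, 0.0], [double] * 4, [identity] * 4)` stops after three of the four layers, because the confidence first crosses 0.9 there. Lowering `threshold` trades accuracy for earlier exits, which is the efficiency and accuracy balance the abstract refers to.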

Code & Models

Repositories

lostsword/LGViT
pytorch

Taxonomy

Methods: Early exiting using confidence measures · Focus