Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

Shibo Jie; Yehui Tang; Jianyuan Guo; Zhi-Hong Deng; Kai Han; Yunhe Wang

arXiv:2408.06798·cs.CV·August 14, 2024

TL;DR

This paper introduces Token Compensator (ToCom), a plugin that recovers the accuracy of Vision Transformers when the token compression degree used at inference differs from the one used in training. It needs no re-tuning of the downstream model, so the inference cost of token compression methods can be adjusted flexibly.

Contribution

We propose ToCom, a universal plugin that compensates for the performance drop caused by mismatched token compression degrees between training and inference, without re-tuning the downstream model.

Findings

01. Up to 2.3% accuracy improvement on CIFAR100

02. Effective across 20 downstream tasks

03. Works with off-the-shelf models without re-tuning

Abstract

Token compression expedites the training and inference of Vision Transformers (ViTs) by reducing the number of redundant tokens, e.g., by pruning inattentive tokens or merging similar tokens. However, when applied to downstream tasks, these approaches suffer a significant performance drop when the compression degrees of the training and inference stages are mismatched, which limits the application of token compression to off-the-shelf trained models. In this paper, we propose a model arithmetic framework to decouple the compression degrees of the two stages. In advance, we perform an additional fast, parameter-efficient self-distillation stage on the pre-trained models to obtain a small plugin, called Token Compensator (ToCom), which describes the gap between models across different compression degrees. During inference, ToCom can be directly inserted into any downstream…
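The "model arithmetic" idea in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the weight dictionaries, the names `subtract` and `add_delta`, and above all the way the gap is obtained. Here the gap is a plain weight difference between two toy models, whereas the paper obtains its plugin through parameter-efficient self-distillation. The sketch only shows the arithmetic of adding a precomputed gap to a downstream model trained at a different compression degree.

```python
from typing import Dict, List

# Toy stand-in for model parameters: name -> flat list of floats (hypothetical).
Weights = Dict[str, List[float]]


def subtract(a: Weights, b: Weights) -> Weights:
    """Return a - b, parameter by parameter."""
    return {name: [x - y for x, y in zip(a[name], b[name])] for name in a}


def add_delta(base: Weights, delta: Weights, scale: float = 1.0) -> Weights:
    """Return base + scale * delta; parameters missing from delta are unchanged."""
    out: Weights = {}
    for name, values in base.items():
        d = delta.get(name, [0.0] * len(values))
        out[name] = [v + scale * dv for v, dv in zip(values, d)]
    return out


# Hypothetical pre-trained weights at two compression degrees.
pretrained_source = {"w": [1.0, 2.0]}  # degree used when training downstream
pretrained_target = {"w": [1.5, 1.0]}  # degree wanted at inference

# Illustrative "gap" between the two degrees (assumption: a raw difference,
# not the self-distilled plugin described in the abstract).
gap = subtract(pretrained_target, pretrained_source)

# Off-the-shelf downstream model, trained at the source degree.
downstream_source = {"w": [3.0, 4.0]}

# Compensated model for inference at the target degree, with no re-tuning.
downstream_target_est = add_delta(downstream_source, gap)
print(downstream_target_est)
```

The point of the design is that the gap is computed once on the pre-trained model and then reused across downstream models, so none of them has to be trained again for a new compression degree.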

Code & Models

Repositories

jieshibo/tocom (PyTorch, official)

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing

Methods: Pruning