Token Cropr: Faster ViTs for Quite a Few Tasks
Benjamin Bergner, Christoph Lippert, Aravindh Mahendran

TL;DR
This paper introduces a fast token pruning method for Vision Transformers (ViTs). Auxiliary prediction heads learn which tokens are relevant to the task, giving speedups of 1.5 to 4x across several vision tasks with small drops in performance.
Contribution
A token pruning approach in which auxiliary heads learn to predict token relevance end-to-end. The heads are removed after training, and the method applies to image classification, semantic segmentation, object detection, and instance segmentation.
Findings
Achieves 1.5 to 4x speedup across tasks
On ADE20k, 2x speedup with only 0.1 mIoU drop
Auxiliary heads are removed after training, so throughput is close to that of a random pruner
Abstract
The adoption of Vision Transformers (ViTs) in resource-constrained applications necessitates improvements in inference throughput. To this end, several token pruning and merging approaches have been proposed that improve efficiency by successively reducing the number of tokens. However, it remains an open problem to design a token reduction method that is fast, maintains high performance, and is applicable to various vision tasks. In this work, we present a token pruner that uses auxiliary prediction heads that learn to select tokens end-to-end based on task relevance. These auxiliary heads can be removed after training, leading to throughput close to that of a random pruner. We evaluate our method on image classification, semantic segmentation, object detection, and instance segmentation, and show speedups of 1.5 to 4x with small drops in performance. As a best case, on the ADE20k…
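The general idea of score-based token pruning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the summary here does not specify how the auxiliary heads compute relevance, so `score_fn` is a hypothetical stand-in for a learned scorer, and `select_top_k`, `prune_tokens`, `successive_prune`, and `keep_schedule` are illustrative names. The sketch only shows the selection step, keeping the `k` highest-scoring tokens at each stage so the token count shrinks successively.

```python
from typing import Callable, List, Sequence

Token = List[float]  # one token = one feature vector (illustrative type)


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring tokens, in original order."""
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of tokens")
    # Rank indices by score, highest first (ties keep their original order).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Re-sort the kept indices so the surviving tokens keep their order.
    return sorted(ranked[:k])


def prune_tokens(tokens: Sequence[Token], scores: Sequence[float], k: int) -> List[Token]:
    """Keep only the k tokens with the highest relevance scores."""
    return [tokens[i] for i in select_top_k(scores, k)]


def successive_prune(
    tokens: Sequence[Token],
    score_fn: Callable[[Token], float],  # hypothetical stand-in for a learned scorer
    keep_schedule: Sequence[int],        # assumed: tokens to keep at each stage
) -> List[Token]:
    """Apply pruning stage by stage, shrinking the token set each time."""
    current = list(tokens)
    for k in keep_schedule:
        scores = [score_fn(t) for t in current]
        current = prune_tokens(current, scores, k)
        # In a real ViT, transformer blocks would process `current` here.
    return current


if __name__ == "__main__":
    toy_tokens = [[0.1], [0.9], [0.5], [0.7]]
    # Toy scorer: treat the single feature as the relevance score.
    print(successive_prune(toy_tokens, lambda t: t[0], keep_schedule=[3, 2]))
```

In this toy setup the scorer is a fixed function. In the method the abstract describes, relevance is learned end-to-end by auxiliary heads that can be removed after training, which is why inference throughput approaches that of a random pruner.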
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Interactive and Immersive Displays
Methods: Pruning
