Block-based Symmetric Pruning and Fusion for Efficient Vision Transformers
Yi-Kuan Hsieh, Jun-Wei Hsieh, Xin Li, Yu-Ming Chang, Yu-Chee Tseng

TL;DR
This paper introduces BSPF-ViT, a novel method for jointly pruning query and key tokens in Vision Transformers, which reduces computational costs while maintaining or improving accuracy through symmetric pruning and fusion.
Contribution
It proposes a block-based symmetric pruning and fusion technique that prunes query and key tokens jointly, taking token interactions into account, and achieves significant efficiency gains without accuracy loss.
Findings
Outperforms state-of-the-art ViT pruning methods in accuracy.
Reduces computational overhead by 50%.
Achieves 40% speedup with improved accuracy.
Abstract
Vision Transformer (ViT) has achieved impressive results across various vision tasks, yet its high computational cost limits practical applications. Recent methods have aimed to reduce ViT's complexity by pruning unimportant tokens. However, these techniques often sacrifice accuracy by independently pruning query (Q) and key (K) tokens, leading to performance degradation due to overlooked token interactions. To address this limitation, we introduce a novel Block-based Symmetric Pruning and Fusion for efficient ViT (BSPF-ViT) that optimizes the pruning of Q/K tokens jointly. Unlike previous methods that consider only a single direction, our approach evaluates each token and its neighbors to decide which tokens to retain by taking token interaction into account. The retained tokens are compressed through a similarity fusion step, preserving key information while reducing…
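To make the idea in the abstract concrete, the following is a minimal NumPy sketch of two-direction token scoring followed by similarity-based fusion. It is not the authors' implementation. The function name `symmetric_prune_and_fuse`, the scoring rule (row-wise plus column-wise softmax mass), the cosine-similarity fusion by averaging, and the `keep` parameter are all assumptions made for illustration. The block partitioning that gives BSPF-ViT its name is omitted.

```python
import numpy as np

def _softmax(z, axis):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def symmetric_prune_and_fuse(q, k, x, keep):
    """Hypothetical sketch: score tokens in both Q and K directions, keep the
    top `keep`, and average each pruned token into its most similar survivor.

    q, k : (n, d) query and key projections of the n tokens
    x    : (n, c) token features to be compressed
    Returns (fused features of shape (keep, c), sorted indices of kept tokens).
    """
    x = np.asarray(x, dtype=float)
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)

    # Assumed scoring rule: importance as a key (attention received under
    # row-wise softmax) plus importance as a query (mass under column-wise softmax).
    as_key = _softmax(logits, axis=1).sum(axis=0)
    as_query = _softmax(logits, axis=0).sum(axis=1)
    score = as_key + as_query

    kept = np.sort(np.argsort(-score)[:keep])
    pruned = np.setdiff1d(np.arange(n), kept)

    # Similarity fusion (assumed form): cosine similarity on features,
    # each pruned token is merged into its nearest kept token by averaging.
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    target = (xn[pruned] @ xn[kept].T).argmax(axis=1)

    fused = x[kept].copy()
    counts = np.ones(keep)
    np.add.at(fused, target, x[pruned])   # accumulate pruned features
    np.add.at(counts, target, 1.0)        # track how many tokens were merged
    fused /= counts[:, None]
    return fused, kept

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4))
k = rng.normal(size=(8, 4))
x = rng.normal(size=(8, 6))
fused, kept = symmetric_prune_and_fuse(q, k, x, keep=3)
print(fused.shape, kept)
```

In this sketch, a token survives only if it matters in at least one of the two directions, and pruned tokens still contribute to the output through the averaging step rather than being discarded outright.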
Taxonomy
Topics: Robotics and Sensor-Based Localization · CCD and CMOS Imaging Sensors · Image and Object Detection Techniques
