Pruning by Block Benefit: Exploring the Properties of Vision Transformer Blocks during Domain Adaptation
Patrick Glandorf, Bodo Rosenhahn

TL;DR
This paper introduces P3B, a novel pruning method for Vision Transformers that effectively reduces model complexity during domain adaptation by assessing block contributions, achieving high sparsity with minimal accuracy loss.
Contribution
The paper proposes P3B, a new pruning approach that uses block-level contribution metrics to improve transfer learning performance and resource efficiency in Vision Transformers.
Findings
P3B achieves up to 70% parameter reduction with only 0.64% accuracy loss.
P3B outperforms classical pruning methods in transfer learning tasks.
P3B maintains high performance even at high sparsity levels.
Abstract
Vision Transformers have set new benchmarks in several tasks, but these models come with high computational costs, which makes them impractical for resource-limited hardware. Network pruning reduces the computational complexity by removing less important operations while maintaining performance. However, pruning a model on an unseen data domain leads to a misevaluation of weight significance, resulting in suboptimal resource assignment. In this work, we find that task-sensitive layers initially fail to improve the feature representation on downstream tasks, leading to performance loss for early pruning decisions. To address this problem, we introduce Pruning by Block Benefit (P3B), a pruning method that uses the relative contribution at the block level to globally assign parameter resources. P3B identifies low-impact components to reduce parameter allocation while preserving…
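The abstract does not state how block contribution is measured or how the budget is split, so the following is only a minimal sketch of the general idea of assigning a global parameter budget by relative block contribution. It assumes each block already has a non-negative contribution score; the names `allocate_budget`, `contributions`, `block_sizes`, and `keep_ratio` are hypothetical and do not come from the paper, and the proportional rule with capping is an illustrative stand-in, not P3B itself.

```python
def allocate_budget(contributions, block_sizes, keep_ratio):
    """Split a global parameter budget across blocks in proportion to
    their (assumed, precomputed) contribution scores.

    contributions: non-negative score per block (hypothetical metric).
    block_sizes:   number of prunable parameters per block.
    keep_ratio:    fraction of all parameters to keep, in [0, 1].
    Returns the number of parameters each block may keep.
    """
    if len(contributions) != len(block_sizes):
        raise ValueError("one contribution score per block is required")
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must lie in [0, 1]")

    budget = keep_ratio * sum(block_sizes)
    alloc = [0.0] * len(block_sizes)
    # Blocks with zero contribution receive no parameters in this sketch.
    active = [i for i, c in enumerate(contributions) if c > 0]

    while active and budget > 1e-9:
        total = sum(contributions[i] for i in active)
        # Blocks whose proportional share would exceed their own size.
        capped = [i for i in active
                  if budget * contributions[i] / total >= block_sizes[i]]
        if not capped:
            # Every remaining share fits: hand out the rest proportionally.
            for i in active:
                alloc[i] = budget * contributions[i] / total
            break
        # Saturate capped blocks, then redistribute what is left.
        for i in capped:
            alloc[i] = float(block_sizes[i])
            budget -= block_sizes[i]
        active = [i for i in active if i not in capped]

    return alloc


if __name__ == "__main__":
    # Three equally sized blocks, keeping half of all parameters.
    print(allocate_budget([0.8, 0.1, 0.1], [100, 100, 100], keep_ratio=0.5))
```

In this sketch a high-contribution block can keep all of its parameters while low-contribution blocks absorb most of the pruning, which mirrors the abstract's description of reducing allocation for low-impact components. How the scores evolve during fine-tuning is left out entirely.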
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Advanced Memory and Neural Computing
