Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He, Joey Tianyi Zhou

TL;DR
This paper introduces DIMAP, a novel pruning method for hierarchical vision transformers that considers module-specific importance and weight distributions, leading to significant model compression with minimal accuracy loss.
Contribution
The paper proposes a data-independent, module-aware pruning approach that fairly compares local attention weights across hierarchical levels and introduces a new weight metric for hierarchical ViTs.
Findings
Removes 52.5% of FLOPs and 52.7% of parameters from Swin-B with only a 0.07% drop in top-5 accuracy on ImageNet-1k.
Outperforms baseline pruning methods by considering module-specific importance.
Maintains or improves accuracy while significantly compressing hierarchical vision transformers.
Abstract
Hierarchical vision transformers (ViTs) have two advantages over conventional ViTs. First, hierarchical ViTs achieve linear computational complexity with respect to image size by local self-attention. Second, hierarchical ViTs create hierarchical feature maps by merging image patches in deeper layers for dense prediction. However, existing pruning methods ignore the unique properties of hierarchical ViTs and use the magnitude value as the weight importance. This approach leads to two main drawbacks. First, the "local" attention weights are compared at a "global" level, which may cause some "locally" important weights to be pruned due to their relatively small magnitude "globally". The second issue with magnitude pruning is that it fails to consider the distinct weight distributions of the network, which are essential for extracting coarse to fine-grained features at various hierarchical…
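The first drawback named in the abstract can be illustrated with a small sketch. This is not the DIMAP method; it only contrasts plain magnitude pruning with one shared threshold against magnitude pruning applied per module. The module names (`stage1_attn`, `stage4_attn`), their shapes, and their weight scales are hypothetical and chosen purely for illustration.

```python
import numpy as np


def global_magnitude_masks(modules, sparsity):
    """Keep-masks from ONE magnitude threshold shared by all modules."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in modules.values()])
    threshold = np.quantile(all_mags, sparsity)
    return {name: np.abs(w) > threshold for name, w in modules.items()}


def per_module_magnitude_masks(modules, sparsity):
    """Keep-masks from a separate magnitude threshold inside each module."""
    return {
        name: np.abs(w) > np.quantile(np.abs(w), sparsity)
        for name, w in modules.items()
    }


def kept_fraction(masks):
    """Fraction of weights that survive pruning, per module."""
    return {name: float(mask.mean()) for name, mask in masks.items()}


rng = np.random.default_rng(0)

# Hypothetical modules: the 10x gap in weight scale is an assumption made
# for illustration, not a measurement from any real Swin checkpoint.
modules = {
    "stage1_attn": rng.normal(0.0, 0.02, size=(64, 64)),
    "stage4_attn": rng.normal(0.0, 0.20, size=(64, 64)),
}

global_kept = kept_fraction(global_magnitude_masks(modules, sparsity=0.5))
local_kept = kept_fraction(per_module_magnitude_masks(modules, sparsity=0.5))

print("global threshold:", global_kept)
print("per-module threshold:", local_kept)
```

Under these assumed scales, the shared threshold removes most of the small-scale module while leaving the large-scale module nearly intact, even though the overall sparsity is 50%. The per-module variant keeps half of each. Neither variant measures a module's actual contribution, which is the gap the paper's module-aware analysis is meant to address.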
Taxonomy
Topics: CCD and CMOS Imaging Sensors
Methods: Pruning
