LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Mohamed M. Sabry Aly, Xulei Yang, Min Wu, Xiaoli Li, and Weisi Lin

TL;DR
This paper introduces LPViT, a block-structured pruning method for vision transformers that reduces resource consumption while maintaining accuracy, achieving significant speedup and power savings through hardware-aware optimization.
Contribution
It proposes a novel block pruning scheme with a hardware-aware learning objective and a lightweight post-training algorithm for efficient ViT model compression.
Findings
Achieves 3.93x speedup on hardware and GPUs for DeiT-B.
Reduces power consumption by 1.4x on GPUs.
Maintains competitive accuracy compared to other pruning methods.
Abstract
Vision transformers have emerged as a promising alternative to convolutional neural networks for various image analysis tasks, offering comparable or superior performance. However, one significant drawback of ViTs is their resource-intensive nature, leading to increased memory footprint, computation complexity, and power consumption. To democratize this high-performance technology and make it more environmentally friendly, it is essential to compress ViT models, reducing their resource requirements while maintaining high performance. In this paper, we introduce a new block-structured pruning to address the resource-intensive issue for ViTs, offering a balanced trade-off between accuracy and hardware acceleration. Unlike unstructured pruning or channel-wise structured pruning, block pruning leverages the block-wise structure of linear layers, resulting in more efficient matrix…
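To make the idea of block pruning concrete, here is a minimal sketch of pruning a linear layer's weight matrix in whole blocks. It is an illustration under stated assumptions, not the paper's algorithm: blocks are scored by plain squared magnitude instead of the hardware-aware objective the abstract mentions, and the function name `block_prune`, the block size, and the sparsity value are all hypothetical.

```python
def block_prune(weight, block_h, block_w, sparsity):
    """Return a copy of a 2-D weight matrix with its lowest-energy blocks zeroed.

    Assumption: blocks are ranked by the sum of squared entries. This is a
    generic magnitude criterion, not LPViT's hardware-aware objective.
    """
    rows, cols = len(weight), len(weight[0])
    if rows % block_h or cols % block_w:
        raise ValueError("matrix dimensions must be divisible by the block size")

    # Score each (block_h x block_w) tile by its squared magnitude.
    scores = []
    for bi in range(rows // block_h):
        for bj in range(cols // block_w):
            energy = sum(
                weight[r][c] ** 2
                for r in range(bi * block_h, (bi + 1) * block_h)
                for c in range(bj * block_w, (bj + 1) * block_w)
            )
            scores.append((energy, bi, bj))

    # Zero out the requested fraction of tiles, weakest first.
    scores.sort()
    n_prune = int(len(scores) * sparsity)
    pruned = [row[:] for row in weight]
    for _, bi, bj in scores[:n_prune]:
        for r in range(bi * block_h, (bi + 1) * block_h):
            for c in range(bj * block_w, (bj + 1) * block_w):
                pruned[r][c] = 0.0
    return pruned


# Toy 4x4 weight matrix pruned in 2x2 blocks at 50% block sparsity.
W = [
    [0.9, -0.8, 0.01, 0.02],
    [0.7, 0.6, -0.03, 0.01],
    [0.02, 0.01, 0.5, -0.4],
    [-0.01, 0.03, 0.3, 0.2],
]
W_pruned = block_prune(W, block_h=2, block_w=2, sparsity=0.5)
for row in W_pruned:
    print(row)
```

Because entire tiles become zero, a block-sparse matrix multiply can skip them outright. That is the property that separates block pruning from unstructured pruning, where the zeros are scattered, and from channel-wise pruning, where whole rows or columns are removed.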
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing · Advanced Optical Sensing Technologies
Methods: Pruning
