HiAP: A Multi-Granular Stochastic Auto-Pruning Framework for Vision Transformers
Andy Li, Aiden Durrant, Milan Markovic, Georgios Leontidis

TL;DR
HiAP is a hierarchical auto-pruning framework for vision transformers that efficiently discovers optimal sub-networks during end-to-end training, reducing computational costs without manual heuristics.
Contribution
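HiAP introduces a multi-granular stochastic relaxation with two kinds of gates: macro-gates that prune entire attention heads and FFN blocks, and micro-gates that prune intra-head dimensions and FFN neurons. Both levels are optimized simultaneously in a single training phase.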
Findings
Achieves competitive accuracy-efficiency trade-offs on ImageNet.
Simplifies the pruning pipeline compared to multi-stage methods.
Effectively balances FLOPs reduction and model performance.
Abstract
Vision Transformers require significant computational resources and memory bandwidth, severely limiting their deployment on edge devices. While recent structured pruning methods successfully reduce theoretical FLOPs, they typically operate at a single structural granularity and rely on complex, multi-stage pipelines with post-hoc thresholding to satisfy sparsity budgets. In this paper, we propose Hierarchical Auto-Pruning (HiAP), a continuous relaxation framework that discovers optimal sub-networks in a single end-to-end training phase without requiring manual importance heuristics or predefined per-layer sparsity targets. HiAP introduces stochastic Gumbel-Sigmoid gates at multiple granularities: macro-gates to prune entire attention heads and FFN blocks, and micro-gates to selectively prune intra-head dimensions and FFN neurons. By optimizing both levels simultaneously, HiAP addresses…
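The gating idea in the abstract can be illustrated with a minimal sketch. It assumes the standard binary-concrete form of a Gumbel-Sigmoid gate (a learnable logit plus logistic noise, divided by a temperature, passed through a sigmoid) and assumes that a macro-gate and the micro-gates combine multiplicatively on a head's output. The names `gumbel_sigmoid`, `gate_head`, `macro_logit`, `micro_logits`, and `tau` are hypothetical and chosen for illustration; the paper's actual parameterization, temperature schedule, and budget loss are not given in this summary and may differ.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gumbel_sigmoid(logit: float, tau: float, rng: random.Random) -> float:
    """Draw one relaxed Bernoulli (binary-concrete) gate value in [0, 1].

    Assumption: standard formulation, not necessarily the paper's exact one.
    """
    # Clamp u away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
    # Logistic noise, equal in distribution to the difference of two Gumbels.
    noise = math.log(u) - math.log(1.0 - u)
    return sigmoid((logit + noise) / tau)


def gate_head(head_output, macro_logit, micro_logits, tau=0.5, rng=None):
    """Apply one macro-gate (whole head) and per-dimension micro-gates.

    Assumption: the two levels are combined by multiplication, so closing
    the macro-gate suppresses every dimension of the head at once.
    """
    rng = rng or random.Random()
    macro = gumbel_sigmoid(macro_logit, tau, rng)
    micro = [gumbel_sigmoid(a, tau, rng) for a in micro_logits]
    return [macro * m * x for m, x in zip(micro, head_output)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 3-dimensional head output with an open macro-gate.
    print(gate_head([1.0, 2.0, 3.0], macro_logit=4.0,
                    micro_logits=[3.0, -3.0, 3.0], rng=rng))
```

In this sketch a strongly negative `macro_logit` drives the whole head toward zero regardless of the micro-gates, while a strongly positive one leaves the decision to the individual micro-gates. Lower values of `tau` push sampled gates closer to 0 or 1.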
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors
