TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers
Zhibo Wang, Zuoyuan Zhang, Xiaoyi Pang, Qile Zhang, Xuanyi Hao, Shuguo Zhuo, Peng Sun

TL;DR
TAP-ViTs is a privacy-preserving pruning framework for Vision Transformers that generates a device-specific pruned model for each device without access to its raw local data, aiming for higher task accuracy than a single shared pruned model at comparable compression.
Contribution
The paper proposes a novel task-adaptive pruning method for ViTs that uses GMM-based privacy-preserving data approximation and dual importance evaluation for device-specific model compression.
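The summary here does not spell out how the GMM-based data approximation works, so the following is only a minimal sketch of the general idea under stated assumptions: a device fits a Gaussian mixture model to its local data and shares only the mixture parameters, and a server samples proxy data from those parameters instead of seeing raw samples. The function names `fit_gmm_1d` and `sample_gmm_1d`, the 1-D toy features, and the quantile-based initialisation are all hypothetical choices for illustration, not the paper's method.

```python
import math
import random


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances).

    Hypothetical helper: a toy stand-in for on-device distribution fitting.
    """
    xs_sorted = sorted(xs)
    n = len(xs)
    # Deterministic initialisation: spread the means over data quantiles.
    means = [xs_sorted[int((j + 0.5) / k * n)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all, 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid collapse
    return weights, means, variances


def sample_gmm_1d(params, n, seed=0):
    """Draw n proxy samples from shared GMM parameters (server side)."""
    weights, means, variances = params
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        out.append(rng.gauss(means[j], math.sqrt(variances[j])))
    return out


# Toy usage: the "device" holds two clusters of scalar features.
_rng = random.Random(1)
local_data = ([_rng.gauss(-5, 1) for _ in range(200)]
              + [_rng.gauss(5, 1) for _ in range(200)])
shared_params = fit_gmm_1d(local_data, k=2)       # only this leaves the device
proxy_data = sample_gmm_1d(shared_params, n=100)  # server-side stand-in data
```

In this sketch only `shared_params` crosses the device boundary. How the actual framework fits, transmits, and uses the mixture, and what privacy guarantees it provides, is not described in this summary.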
Findings
Outperforms existing pruning methods at similar compression levels
Enables privacy-preserving, device-specific ViT pruning without local data access
Achieves better task accuracy and efficiency across multiple datasets and models
Abstract
Vision Transformers (ViTs) have demonstrated strong performance across a wide range of vision tasks, yet their substantial computational and memory demands hinder efficient deployment on resource-constrained mobile and edge devices. Pruning has emerged as a promising direction for reducing ViT complexity. However, existing approaches either (i) produce a single pruned model shared across all devices, ignoring device heterogeneity, or (ii) rely on fine-tuning with device-local data, which is often infeasible due to limited on-device resources and strict privacy constraints. As a result, current methods fall short of enabling task-customized ViT pruning in privacy-preserving mobile computing settings. This paper introduces TAP-ViTs, a novel task-adaptive pruning framework that generates device-specific pruned ViT models without requiring access to any raw local data. Specifically, to…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · IoT and Edge/Fog Computing
