Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks
Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

TL;DR
This paper introduces Meta-Vote Pruning (MVP), a method that leverages pruned models of similar tasks to efficiently create effective models for new downstream tasks, reducing pruning iterations and improving performance.
Contribution
The paper proposes a novel meta-pruning approach that reuses pruned models from similar tasks to accelerate and improve the pruning of pre-trained models for new tasks.
Findings
MVP significantly reduces the number of pruning iterations needed for a new task.
Pruned models for similar tasks have overlapping structures that can be exploited.
MVP achieves better accuracy and efficiency compared to traditional pruning methods.
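The overlap finding and the "voting" idea in the title can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that each pruned model is described per layer by the indices of the filters it keeps, that overlap is measured with a Jaccard ratio, and that the initial structure for a new task keeps the filters chosen by the most neighboring tasks. The function names `overlap_ratio` and `vote_mask` and the example indices are hypothetical.

```python
from collections import Counter


def overlap_ratio(kept_a, kept_b):
    """Jaccard overlap between two sets of kept filter indices (assumed metric)."""
    a, b = set(kept_a), set(kept_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def vote_mask(neighbor_kept, keep):
    """Return the `keep` filter indices that the most neighbor tasks retained.

    Assumption: simple majority voting with ties broken by lower index.
    The summary does not specify the paper's actual voting rule.
    """
    votes = Counter()
    for kept in neighbor_kept:
        votes.update(set(kept))  # one vote per task per filter
    ranked = sorted(votes, key=lambda idx: (-votes[idx], idx))
    return sorted(ranked[:keep])


# Hypothetical kept-filter indices for one layer in three similar tasks.
neighbors = [[0, 2, 3, 7], [0, 2, 5, 7], [0, 3, 5, 7]]

# Initial structure for the new task, to be refined by a few fine-tuning steps.
initial = vote_mask(neighbors, keep=4)
pairwise = overlap_ratio(neighbors[0], neighbors[1])
```

In this toy setting the voted structure would then serve only as a starting point; the summary states that a few fine-tuning steps on such a model suffice, but the details of that step are not given here.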
Abstract
As a few large-scale pre-trained models become the major choices of various applications, new challenges arise for model pruning, e.g., can we avoid pruning the same model from scratch for every downstream task? How can we reuse the pruning results of previous tasks to accelerate the pruning for a new task? To address these challenges, we create a small model for a new task from the pruned models of similar tasks. We show that a few fine-tuning steps on this model suffice to produce a promising pruned model for the new task. We study this "meta-pruning" from nearest tasks on two major classes of pre-trained models, convolutional neural network (CNN) and vision transformer (ViT), under a limited budget of pruning iterations. Our study begins by investigating the overlap of pruned models for similar tasks and how the overlap changes over different layers and blocks. Inspired by these…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Attention Is All You Need · Pruning · Linear Layer · Softmax · Layer Normalization · Multi-Head Attention · Dense Connections · Residual Connection · Vision Transformer
