LOP: Learning Optimal Pruning for Efficient On-Demand MLLMs Scaling

Zhihan Zhang; Xiang Pan; Hongchen Wei; Zhenzhong Chen

arXiv:2506.12826 · cs.CV · June 17, 2025

TL;DR

LOP is an efficient neural pruning framework that learns pruning strategies directly from the target pruning constraint, which removes the cost of iterative search and outperforms existing pruning methods for deploying multimodal large language models.

Contribution

The paper presents a pruning approach that trains autoregressive neural networks to predict layer-wise pruning strategies directly from the target constraint, without iterative search, enabling fast and adaptive model compression.

Findings

01. LOP achieves up to a 1000x speedup over traditional search-based pruning methods.

02. LOP outperforms state-of-the-art pruning methods across multiple tasks.

03. The method adapts to a range of target pruning constraints.

Abstract

Structural pruning techniques are essential for deploying multimodal large language models (MLLMs) across various hardware platforms, from edge devices to cloud servers. However, current pruning methods typically determine optimal strategies through iterative search, which incurs substantial computational overhead for on-demand MLLM adaptation. To address this challenge, we propose LOP, an efficient neural pruning framework that learns optimal pruning strategies from the target pruning constraint, eliminating the need for computationally expensive search-based methods. LOP trains autoregressive neural networks (NNs) to directly predict layer-wise pruning strategies that adapt to the target pruning constraint, without time-consuming iterative search. Experimental results across multiple tasks show that LOP outperforms state-of-the-art pruning methods in…
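The idea of predicting layer-wise pruning strategies autoregressively from a target constraint can be illustrated with a small sketch. The code below is not the authors' architecture or training procedure: it is a toy, untrained recurrent predictor in plain Python, and every name in it (`init_weights`, `predict_keep_ratios`, the three-feature input, the hidden size) is a hypothetical choice made for illustration. It only shows the control flow: one forward pass emits a keep ratio per layer, each conditioned on the target constraint and on the previous layer's decision, with no search loop.

```python
import math
import random


def init_weights(hidden_size, input_size=3, seed=0):
    """Random, untrained weights for the toy predictor (hypothetical shapes)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in": mat(hidden_size, input_size),
        "w_h": mat(hidden_size, hidden_size),
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)],
    }


def predict_keep_ratios(target_ratio, num_layers, weights):
    """Emit one keep ratio per layer in a single autoregressive pass."""
    hidden_size = len(weights["w_out"])
    hidden = [0.0] * hidden_size
    prev_ratio = target_ratio  # assumption: the constraint doubles as the start token
    ratios = []
    for layer in range(num_layers):
        # Input features: target constraint, previous decision, layer position.
        x = [target_ratio, prev_ratio, layer / num_layers]
        new_hidden = []
        for i in range(hidden_size):
            s = sum(w * v for w, v in zip(weights["w_in"][i], x))
            s += sum(w * h for w, h in zip(weights["w_h"][i], hidden))
            new_hidden.append(math.tanh(s))
        hidden = new_hidden
        # Sigmoid maps the output to a keep ratio strictly between 0 and 1.
        logit = sum(w * h for w, h in zip(weights["w_out"], hidden))
        prev_ratio = 1.0 / (1.0 + math.exp(-logit))
        ratios.append(prev_ratio)
    return ratios


weights = init_weights(hidden_size=8)
ratios = predict_keep_ratios(target_ratio=0.5, num_layers=6, weights=weights)
achieved = sum(ratios) / len(ratios)
print([round(r, 3) for r in ratios], round(achieved, 3))
```

With random weights the achieved average ratio will generally not match the target; in a trained predictor, the gap between `achieved` and `target_ratio` is the kind of quantity a loss term would be expected to penalize, though the paper's actual objective is not stated in the text above.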

Taxonomy

Topics: Speech and dialogue systems