Mitigating Information Loss under High Pruning Rates for Efficient Large Vision Language Models

Mingyu Fu; Wei Suo; Ji Ma; Lin Yuanbo Wu; Peng Wang; Yanning Zhang

arXiv:2508.01236 · cs.CV · August 5, 2025

TL;DR

This paper introduces ACCM, a method that uses adaptive captioning to preserve visual information in large vision language models, enabling high pruning rates with minimal performance loss and reduced computational costs.

Contribution

The paper proposes ACCM, a novel adaptive content compensation technique that employs self-supervised captioning and selection to mitigate information loss during pruning in LVLMs.

Findings

01 ACCM outperforms existing methods across seven benchmarks.

02 ACCM achieves 20.6% higher accuracy with 6.5% fewer FLOPs.

03 ACCM effectively preserves visual information at high pruning rates.

Abstract

Despite the great success of Large Vision Language Models (LVLMs), their high computational cost severely limits their broad application. This cost stems mainly from the visual sequence of the input, which consists of hundreds or even thousands of tokens. Although existing methods have made progress by removing redundant tokens, they suffer severe performance degradation at high pruning rates because visual information is lost. In this paper, we propose an Adaptive Content Compensation Method (ACCM), which effectively mitigates this visual information loss via an image caption. Specifically, ACCM comprises two key components: a lightweight caption model and a selector. First, the caption model generates question-related descriptions under the guidance of the user instruction. Then the selector further identifies a contextually appropriate caption from…
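
The two-stage idea in the abstract (prune visual tokens, then compensate with a selected caption) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the per-token importance scores, the word-overlap selector, and the order in which the pieces are concatenated are all assumptions made for illustration. In the paper the caption model and selector are learned components.

```python
from typing import Callable, List, Sequence


def prune_tokens(tokens: Sequence[str], scores: Sequence[float], keep: int) -> List[str]:
    """Keep the `keep` highest-scoring visual tokens, preserving their original order."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    return [tokens[i] for i in sorted(top)]


def select_caption(candidates: Sequence[str], question: str) -> str:
    """Hypothetical stand-in for the selector: pick the candidate caption
    that shares the most words with the question."""
    q_words = set(question.lower().split())
    return max(candidates, key=lambda c: len(q_words & set(c.lower().split())))


def build_input(
    visual_tokens: Sequence[str],
    scores: Sequence[float],
    keep: int,
    captioner: Callable[[str], List[str]],
    question: str,
) -> List[str]:
    """Assemble a shortened input: kept visual tokens, caption words, question words.
    The concatenation order is an assumption, not taken from the paper."""
    kept = prune_tokens(visual_tokens, scores, keep)
    caption = select_caption(captioner(question), question)
    return kept + caption.split() + question.split()


if __name__ == "__main__":
    # Hypothetical captioner that ignores the question and returns fixed candidates.
    fake_captioner = lambda q: ["a dog on grass", "a red car parked outside"]
    print(build_input(["v0", "v1", "v2", "v3"], [0.1, 0.9, 0.3, 0.8], 2,
                      fake_captioner, "what color is the car"))
```

The point of the sketch is only the shape of the trade: the visual sequence shrinks from four tokens to two, and a short text caption carries some of the discarded content back into the model input.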

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications