TransPrune: Token Transition Pruning for Efficient Large Vision-Language Model

Ao Li; Yuxiang Duan; Jinghui Zhang; Congbo Ma; Yutong Xie; Gustavo Carneiro; Mohammad Yaqub; Hu Wang

arXiv:2507.20630 · cs.CV · November 18, 2025

TL;DR

TransPrune is a training-free token pruning method for large vision-language models (LVLMs) that uses token transition signals to cut inference cost while keeping performance comparable to the unpruned model.

Contribution

The paper proposes a token importance criterion based on token transitions, avoiding limitations of attention-based criteria such as positional bias, and demonstrates its effectiveness across eight benchmarks.

Findings

01. TransPrune reduces inference TFLOPs by over 50%.

02. Performance stays comparable to the original, unpruned LVLMs on eight benchmarks.

03. Token Transition Variation (TTV) alone is an effective, attention-free measure of token importance.

Abstract

Large Vision-Language Models (LVLMs) have advanced multimodal learning but face high computational costs due to the large number of visual tokens, motivating token pruning to improve inference efficiency. The key challenge lies in identifying which tokens are truly important. Most existing approaches rely on attention-based criteria to estimate token importance. However, they inherently suffer from certain limitations, such as positional bias. In this work, we explore a new perspective on token importance based on token transitions in LVLMs. We observe that the transition of token representations provides a meaningful signal of semantic information. Based on this insight, we propose TransPrune, a training-free and efficient token pruning method. Specifically, TransPrune progressively prunes tokens by assessing their importance through a combination of Token Transition Variation…
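To make the idea of scoring tokens by how their representations change more concrete, here is a minimal Python sketch. It is not the paper's definition of Token Transition Variation, which the truncated abstract above does not spell out. As an assumption made purely for illustration, it scores each token by how much its hidden state changes in magnitude and in direction across one layer, then keeps the highest-scoring tokens. The function names `transition_scores` and `prune_tokens`, the min-max normalization, and the equal weighting of the two terms are all hypothetical choices.

```python
import numpy as np


def transition_scores(h_in, h_out, eps=1e-8):
    """Illustrative proxy score, NOT the paper's TTV formula.

    h_in, h_out: arrays of shape (num_tokens, hidden_dim) holding each
    token's representation before and after a layer.
    """
    norm_in = np.linalg.norm(h_in, axis=-1)
    norm_out = np.linalg.norm(h_out, axis=-1)

    # How much each token's length changed across the layer.
    magnitude_change = np.abs(norm_out - norm_in)

    # How much each token's direction changed (1 - cosine similarity).
    cosine = np.sum(h_in * h_out, axis=-1) / (norm_in * norm_out + eps)
    direction_change = 1.0 - cosine

    def minmax(x):
        # Rescale to [0, 1] so the two terms are comparable (assumption).
        return (x - x.min()) / (x.max() - x.min() + eps)

    return minmax(magnitude_change) + minmax(direction_change)


def prune_tokens(tokens, scores, keep_ratio):
    """Keep the top `keep_ratio` fraction of tokens, preserving their order."""
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return tokens[keep], keep


# Toy input: 6 tokens with 4-dimensional hidden states.
h_in = np.arange(1, 25, dtype=float).reshape(6, 4)
h_out = h_in.copy()
h_out[2] = -3.0 * h_in[2]  # token 2 flips direction and grows
h_out[5] = 2.0 * h_in[5]   # token 5 keeps its direction but doubles in length

scores = transition_scores(h_in, h_out)
kept, idx = prune_tokens(h_out, scores, keep_ratio=1 / 3)
```

In this toy input only tokens 2 and 5 change across the layer, so they are the two tokens that survive pruning. A progressive scheme like the one the abstract describes would repeat a step of this kind at several layers, with the scoring rule and keep ratios taken from the paper rather than from this sketch.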

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications