MPM: Mutual Pair Merging for Efficient Vision Transformers

Simon Ravé; Pejman Rasti; David Rousseau

arXiv:2604.05718·cs.CV·April 8, 2026


TL;DR

This paper introduces Mutual Pair Merging (MPM), a training-free token aggregation method that accelerates vision transformers for semantic segmentation, reducing latency and increasing throughput with minimal accuracy loss.

Contribution

MPM is a simple, reconstruction-aware, training-free token merging technique that improves end-to-end latency and throughput in vision transformer segmentation tasks.

Findings

01

MPM reduces per-image latency by up to 60% on Raspberry Pi 5.

02

MPM increases throughput by up to 20% on NVIDIA H100 with FlashAttention-2.

03

MPM keeps the mIoU drop below 3% while improving latency and throughput.

Abstract

Decreasing sequence length is a common way to accelerate transformers, but prior token reduction work often targets classification and reports proxy metrics rather than end-to-end latency. For semantic segmentation, token reduction is further constrained by the need to reconstruct dense, pixel-aligned features, and on modern accelerators the overhead of computing merge maps can erase expected gains. We propose Mutual Pair Merging (MPM), a training-free token aggregation module that forms mutual nearest-neighbor pairs in cosine space, averages each pair, and records a merge map enabling a gather-based reconstruction before the decoder so that existing segmentation heads can be used unchanged. MPM introduces no learned parameters and no continuous compression knob (no keep-rate or threshold). The speed-accuracy trade-off is set by a discrete insertion schedule. We benchmark end-to-end…
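The merge-and-reconstruct idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `mutual_pair_merge` and `unmerge` are hypothetical, the quadratic neighbor search is chosen for readability rather than speed, and details such as batching, tie-breaking, and where merges are inserted in the network are assumptions or omitted. It shows only the three steps named above: mutual nearest-neighbor pairing by cosine similarity, pair averaging, and a merge map used for gather-based reconstruction.

```python
import math


def mutual_pair_merge(tokens):
    """Average mutual nearest-neighbor pairs (cosine similarity).

    tokens: list of equal-length float vectors.
    Returns (merged, merge_map), where merge_map[i] is the index in
    `merged` that original token i was assigned to.
    Illustrative sketch only; names and details are assumptions.
    """
    n = len(tokens)

    # L2-normalize so that a dot product equals cosine similarity.
    normed = []
    for t in tokens:
        norm = math.sqrt(sum(x * x for x in t)) or 1.0
        normed.append([x / norm for x in t])

    # Nearest neighbor of every token, excluding the token itself.
    nearest = []
    for i in range(n):
        best_j, best_sim = -1, -math.inf
        for j in range(n):
            if j == i:
                continue
            sim = sum(a * b for a, b in zip(normed[i], normed[j]))
            if sim > best_sim:
                best_j, best_sim = j, sim
        nearest.append(best_j)

    merged, merge_map = [], [-1] * n
    for i in range(n):
        if merge_map[i] != -1:
            continue  # already absorbed as the partner of an earlier token
        j = nearest[i]
        if j != -1 and nearest[j] == i:
            # Mutual pair: both tokens pick each other, so average them.
            merged.append([(a + b) / 2 for a, b in zip(tokens[i], tokens[j])])
            merge_map[i] = merge_map[j] = len(merged) - 1
        else:
            # No mutual partner: the token passes through unchanged.
            merged.append(list(tokens[i]))
            merge_map[i] = len(merged) - 1
    return merged, merge_map


def unmerge(merged, merge_map):
    """Gather-based reconstruction back to the original sequence length."""
    return [list(merged[k]) for k in merge_map]


tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [-1.0, -1.0]]
merged, merge_map = mutual_pair_merge(tokens)
restored = unmerge(merged, merge_map)
```

In this toy input, tokens 0 and 1 form one mutual pair and tokens 2 and 3 form another, while token 4 has no mutual partner and is kept. Note that the number of merged tokens follows from the data alone, which is consistent with the abstract's statement that there is no keep-rate or threshold. Because reconstruction is a plain gather, the restored sequence has the original length and a dense decoder could consume it unchanged.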
