QuarterMap: Efficient Post-Training Token Pruning for Visual State Space Models

Tien-Yu Chi; Hung-Yueh Chiang; Diana Marculescu; Kai-Chiang Wu

arXiv:2507.09514 · cs.CV · July 15, 2025

TL;DR

QuarterMap is a post-training activation pruning technique that reduces spatial redundancy in SSM-based vision models such as VMamba. It improves inference throughput by up to 11% without retraining, at a cost of less than 0.9% accuracy on ImageNet-1K.

Contribution

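The paper introduces an activation pruning method tailored to SSM-based vision models with four-directional scanning. Redundant spatial activations are removed before the scan, and the original dimensions are restored afterwards via nearest-neighbor upsampling, which raises throughput without retraining.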

Findings

01 Up to 11% speedup on VMamba with less than 0.9% accuracy drop on ImageNet-1K, and similar gains on ADE20K segmentation

02 Consistent throughput gains on MedMamba across multiple medical imaging tasks, with accuracy preserved

03 Outperforms token merging methods like ToMe in efficiency

Abstract

State space models (SSMs) reduce the quadratic complexity of transformers by leveraging linear recurrence. Recently, VMamba has emerged as a strong SSM-based vision backbone, yet remains bottlenecked by spatial redundancy in its four-directional scan. We propose QuarterMap, a post-training activation pruning method that removes redundant spatial activations before scanning and restores dimensions via nearest-neighbor upsampling. Our method improves throughput without retraining. On ImageNet-1K, QuarterMap achieves up to 11% speedup on VMamba with less than 0.9% accuracy drop, and yields similar gains on ADE20K segmentation. Beyond VMamba, we validate QuarterMap on MedMamba, a domain-specific model that shares the same four-directional scanning structure, where it consistently improves throughput while preserving accuracy across multiple medical imaging tasks. Compared to token merging…
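The prune-then-restore idea in the abstract can be illustrated with a small dependency-free sketch. It is not the authors' implementation. The selection rule (keeping every second row and column, so that a quarter of the activations remain), the `stride` parameter, and the function names `prune`, `upsample_nearest`, and `prune_scan_restore` are all assumptions made for illustration. The `scan` argument is a hypothetical stand-in for the four-directional scan, which is not reproduced here.

```python
from typing import Callable, List

Grid = List[List[float]]  # one channel of a feature map, indexed [row][col]


def prune(fmap: Grid, stride: int = 2) -> Grid:
    """Keep every `stride`-th row and column (assumed selection rule)."""
    return [row[::stride] for row in fmap[::stride]]


def upsample_nearest(fmap: Grid, height: int, width: int, stride: int = 2) -> Grid:
    """Restore a height x width grid by repeating each kept activation."""
    return [
        [fmap[i // stride][j // stride] for j in range(width)]
        for i in range(height)
    ]


def prune_scan_restore(fmap: Grid, scan: Callable[[Grid], Grid], stride: int = 2) -> Grid:
    """Prune before the scan, run the scan on the smaller map, then restore size."""
    height, width = len(fmap), len(fmap[0])
    small = prune(fmap, stride)   # fewer activations reach the scan
    scanned = scan(small)         # placeholder for the real scan
    return upsample_nearest(scanned, height, width, stride)


# Usage: a 4x4 map and a toy "scan" that doubles every activation.
grid = [[float(4 * i + j) for j in range(4)] for i in range(4)]
restored = prune_scan_restore(grid, lambda g: [[2 * v for v in row] for row in g])
kept_fraction = sum(len(r) for r in prune(grid)) / sum(len(r) for r in grid)
```

With `stride=2` the scan sees one quarter of the original activations, which is where a throughput gain would come from in this sketch. The actual speedup depends on the model and on where pruning is applied.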
