PTQ4VM: Post-Training Quantization for Visual Mamba

Younghyun Cho; Changhun Lee; Seonggon Kim; Eunhyeok Park

arXiv:2412.20386·cs.CV·April 8, 2025

TL;DR

This paper introduces PTQ4VM, a post-training quantization method tailored for Visual Mamba, addressing its unique quantization challenges to enable fast, low-accuracy-loss model deployment on GPUs.

Contribution

It presents the first quantization approach designed specifically for Visual Mamba, built on two strategies: Per-Token Static (PTS) quantization and Joint Learning of Smoothing Scale and Step Size (JLSS).
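The "smoothing scale" in JLSS (Joint Learning of Smoothing Scale and Step Size, as the abstract expands it) can be illustrated with the standard smoothing identity: dividing each activation channel by a scale and multiplying the matching weight row by the same scale leaves the layer output unchanged while shrinking outlier channels. The NumPy sketch below is only an illustration under stated assumptions. The square-root heuristic for `s`, the tensor shapes, and all variable names are hypothetical, and nothing is learned here, whereas the paper learns the scale jointly with the quantization step size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations (tokens, in_channels) with one outlier channel.
x = rng.standard_normal((8, 4))
x[:, 2] *= 50.0
# Toy weights (in_channels, out_channels).
w = rng.standard_normal((4, 3))

# Hypothetical fixed smoothing scale per input channel (an assumption,
# not the learned scale described in the paper).
s = np.sqrt(np.abs(x).max(axis=0))

x_smooth = x / s            # shrink each activation channel
w_smooth = w * s[:, None]   # grow the matching weight row by the same factor

y_ref = x @ w
y_smooth = x_smooth @ w_smooth  # mathematically identical to y_ref

# Ratio between the largest and smallest per-channel magnitude.
spread_before = np.abs(x).max(axis=0).max() / np.abs(x).max(axis=0).min()
spread_after = np.abs(x_smooth).max(axis=0).max() / np.abs(x_smooth).max(axis=0).min()
```

With this heuristic the per-channel spread after smoothing is the square root of the spread before, so the outlier channel dominates the quantization range less.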

Findings

01. Achieves up to 1.83x GPU speedup with negligible accuracy loss.

02. Converts pretrained Visual Mamba models in under 15 minutes.

03. Addresses three quantization challenges unique to Visual Mamba: token-wise variance, channel-wise outliers, and a long tail of activations.

Abstract

Visual Mamba is an approach that extends the selective state space model, Mamba, to vision tasks. It processes image tokens sequentially in a fixed order, accumulating information to generate outputs. Despite its growing popularity for delivering high-quality outputs at a low computational cost across various tasks, Visual Mamba is highly susceptible to quantization, which makes further performance improvements challenging. Our analysis reveals that the fixed token access order in Visual Mamba introduces unique quantization challenges, which we categorize into three main issues: 1) token-wise variance, 2) channel-wise outliers, and 3) a long tail of activations. To address these challenges, we propose Post-Training Quantization for Visual Mamba (PTQ4VM), which introduces two key strategies: Per-Token Static (PTS) quantization and Joint Learning of Smoothing Scale and Step Size (JLSS).…
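One way to read "Per-Token Static" quantization is that each token position gets its own quantization scale, fixed ahead of time from calibration data rather than computed at inference. Because Visual Mamba visits tokens in a fixed order, a given position plausibly sees similar activation magnitudes across images. The NumPy sketch below illustrates that reading only. The function names, the max-based calibration rule, the symmetric 8-bit format, and the synthetic data are all assumptions, not the authors' implementation.

```python
import numpy as np

def calibrate_per_token_scales(calib_acts, n_bits=8):
    """calib_acts: (num_samples, num_tokens, num_channels).
    Returns one fixed scale per token position, shape (num_tokens, 1)."""
    qmax = 2 ** (n_bits - 1) - 1
    # Largest magnitude seen at each token position over samples and channels.
    max_abs = np.abs(calib_acts).max(axis=(0, 2))
    max_abs = np.maximum(max_abs, 1e-8)  # avoid division by zero
    return (max_abs / qmax)[:, None]

def quantize_static(x, scales, n_bits=8):
    """x: (num_tokens, num_channels). Uses precomputed scales (n_bits <= 8)."""
    qmax = 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scales), -qmax - 1, qmax)
    return q.astype(np.int8)

def dequantize(q, scales):
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
num_tokens, num_channels = 16, 32
# Synthetic activations whose magnitude depends on token position
# (a stand-in for token-wise variance).
token_gain = np.linspace(0.1, 10.0, num_tokens)[None, :, None]
calib = rng.standard_normal((64, num_tokens, num_channels)) * token_gain

scales = calibrate_per_token_scales(calib)   # computed once, offline
x = calib[0]
x_hat = dequantize(quantize_static(x, scales), scales)

# Baseline for comparison: a single static scale for the whole tensor.
tensor_scale = np.abs(calib).max() / 127
x_hat_tensor = dequantize(quantize_static(x, tensor_scale), tensor_scale)
```

In this toy setup the low-magnitude token positions are reconstructed far more accurately with per-token scales than with the single per-tensor scale, since the per-tensor scale is dictated by the largest tokens.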

Code & Models

Repositories

younghyun197/ptq4vm (PyTorch, official)

Taxonomy

Topics: Image and Video Stabilization · Advanced Vision and Imaging

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces