Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model

Navin Ranjan; Andreas Savakis

arXiv:2505.04861·cs.CV·May 9, 2025

TL;DR

Mix-QSAM introduces a mixed-precision post-training quantization framework for the Segment Anything Model, leveraging layer importance and inter-layer dependencies to optimize bit-width allocation, resulting in improved accuracy and efficiency on segmentation and detection tasks.

Contribution

The paper proposes a mixed-precision PTQ method for SAM that scores layers by importance and cross-layer synergy, then formulates bit-width allocation as an Integer Quadratic Programming (IQP) problem.
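To make the idea of an integer quadratic bit-width allocation concrete, here is a minimal sketch that solves a toy instance by exhaustive search. The objective (importance-weighted bits minus a synergy-weighted squared mismatch between adjacent layers), the budget constraint, and the names `allocate_bits`, `importance`, `synergy`, `sizes`, and `budget` are all illustrative assumptions; the summary above does not give the paper's actual formulation.

```python
from itertools import product


def allocate_bits(importance, synergy, bit_choices, sizes, budget):
    """Brute-force a toy integer quadratic bit-allocation problem.

    importance[i]: assumed importance score of layer i (higher = more sensitive)
    synergy[i]:    assumed dependency strength between layer i and layer i + 1
    bit_choices:   candidate bit-widths, e.g. (4, 8)
    sizes[i]:      parameter count of layer i
    budget:        maximum total number of weight bits allowed
    """
    best_score, best_bits = float("-inf"), None
    for bits in product(bit_choices, repeat=len(importance)):
        # Constraint: total model size in bits must fit the budget.
        cost = sum(b * s for b, s in zip(bits, sizes))
        if cost > budget:
            continue
        # Linear term: reward giving more bits to important layers.
        linear = sum(w * b for w, b in zip(importance, bits))
        # Quadratic term: penalize bit-width gaps between dependent neighbors.
        penalty = sum(
            s * (bits[i] - bits[i + 1]) ** 2 for i, s in enumerate(synergy)
        )
        score = linear - penalty
        if score > best_score:
            best_score, best_bits = score, bits
    return best_bits


# Three equally sized layers; the first two are important and tightly coupled.
print(allocate_bits([0.9, 0.8, 0.1], [1.0, 0.0], (4, 8), [1, 1, 1], budget=20))
```

Exhaustive search is only practical for a handful of layers; a real model would need a dedicated integer programming solver.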

Findings

01 Achieves up to 20% higher average precision than existing PTQ methods under 6-bit and 4-bit mixed-precision settings.

02 Outperforms existing PTQ methods on segmentation and detection tasks.

03 Maintains computational efficiency while improving accuracy.

Abstract

The Segment Anything Model (SAM) is a popular vision foundation model; however, its high computational and memory demands make deployment on resource-constrained devices challenging. While Post-Training Quantization (PTQ) is a practical approach for reducing computational overhead, existing PTQ methods rely on fixed bit-width quantization, leading to suboptimal accuracy and efficiency. To address this limitation, we propose Mix-QSAM, a mixed-precision PTQ framework for SAM. First, we introduce a layer-wise importance score, derived using Kullback-Leibler (KL) divergence, to quantify each layer's contribution to the model's output. Second, we introduce cross-layer synergy, a novel metric based on causal mutual information, to capture dependencies between adjacent layers. This ensures that highly interdependent layers maintain similar bit-widths, preventing abrupt precision mismatches…
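As a rough illustration of a KL-divergence-based layer importance score, the sketch below compares the full-precision output distribution with the output obtained when a single layer is quantized. Which distributions the paper actually compares is not stated in the excerpt above, so the use of softmax over output logits and the names `softmax`, `kl_divergence`, and `layer_importance` are assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def layer_importance(fp_logits, quantized_logits_per_layer):
    """Score each layer by how far quantizing it alone shifts the output.

    fp_logits: output of the full-precision model (assumed given)
    quantized_logits_per_layer[i]: output when only layer i is quantized
    """
    p = softmax(fp_logits)
    return [kl_divergence(p, softmax(q)) for q in quantized_logits_per_layer]


# Layer 0 leaves the output unchanged; layer 1 swaps the top two scores.
scores = layer_importance([2.0, 1.0, 0.1], [[2.0, 1.0, 0.1], [1.0, 2.0, 0.1]])
print(scores)
```

Under this assumed scoring, a layer whose quantization barely changes the output gets a score near zero and is a candidate for a lower bit-width.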

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · CCD and CMOS Imaging Sensors

Methods: Segment Anything Model