Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan, Andreas Savakis

TL;DR
Mix-QSAM introduces a mixed-precision post-training quantization framework for the Segment Anything Model, leveraging layer importance and inter-layer dependencies to optimize bit-width allocation, resulting in improved accuracy and efficiency on segmentation and detection tasks.
Contribution
It proposes a mixed-precision post-training quantization (PTQ) method for SAM that scores layers with importance and cross-layer synergy metrics and formulates bit-width allocation as an Integer Quadratic Program (IQP).
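To make the idea of bit-width allocation as an integer quadratic problem concrete, here is a minimal sketch. It is not the paper's formulation: the objective (reward bits on important layers, penalize squared bit-width gaps between adjacent layers in proportion to their synergy), the average-bit budget, the candidate bit-widths, and all names (`allocate_bits`, `importance`, `synergy`, `lam`, `avg_budget`) are assumptions made for illustration. It uses exhaustive search rather than an IQP solver, which is only practical for a handful of layers.

```python
from itertools import product

def allocate_bits(importance, synergy, choices=(4, 6, 8), avg_budget=6.0, lam=1.0):
    """Toy bit-width search (assumed objective, NOT the paper's exact IQP).

    importance[i]: hypothetical importance score of layer i
    synergy[i]:    hypothetical synergy between layer i and layer i+1
    """
    n = len(importance)
    best, best_score = None, float("-inf")
    for bits in product(choices, repeat=n):
        # Budget constraint: mean bit-width must not exceed avg_budget.
        if sum(bits) > avg_budget * n:
            continue
        # Linear term: more bits on important layers is rewarded.
        reward = sum(w * b for w, b in zip(importance, bits))
        # Quadratic term: strongly coupled neighbors should have similar bit-widths.
        penalty = sum(s * (bits[i] - bits[i + 1]) ** 2 for i, s in enumerate(synergy))
        score = reward - lam * penalty
        if score > best_score:
            best, best_score = bits, score
    return best, best_score

# Made-up scores for a 4-layer model; layers 1 and 2 are strongly coupled.
importance = [0.9, 0.1, 0.2, 0.8]
synergy = [0.05, 0.5, 0.05]
bits, score = allocate_bits(importance, synergy)
print(bits, score)
```

With real layer counts the search space grows exponentially, so a dedicated integer programming solver would be needed in place of the brute-force loop.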
Findings
Achieves up to 20% higher average precision with 6-bit and 4-bit mixed-precision settings.
Outperforms existing PTQ methods on segmentation and detection tasks.
Maintains computational efficiency while improving accuracy.
Abstract
The Segment Anything Model (SAM) is a popular vision foundation model; however, its high computational and memory demands make deployment on resource-constrained devices challenging. While Post-Training Quantization (PTQ) is a practical approach for reducing computational overhead, existing PTQ methods rely on fixed bit-width quantization, leading to suboptimal accuracy and efficiency. To address this limitation, we propose Mix-QSAM, a mixed-precision PTQ framework for SAM. First, we introduce a layer-wise importance score, derived using Kullback-Leibler (KL) divergence, to quantify each layer's contribution to the model's output. Second, we introduce cross-layer synergy, a novel metric based on causal mutual information, to capture dependencies between adjacent layers. This ensures that highly interdependent layers maintain similar bit-widths, preventing abrupt precision mismatches…
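The layer-wise importance score described in the abstract can be illustrated with a small sketch: quantize one layer at a time and measure the KL divergence between the full-precision output distribution and the perturbed one. Everything concrete here is an assumption for illustration, not the paper's procedure: the toy ReLU network stands in for SAM, the symmetric uniform quantizer, the softmax over outputs, and the helper names (`fake_quantize`, `layer_importance`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def fake_quantize(w, bits):
    """Symmetric uniform quantize-then-dequantize of a weight tensor (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    if scale == 0:
        return w.copy()
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def forward(weights, x):
    """Toy stand-in model: linear layers with ReLU between them."""
    h = x
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

def layer_importance(weights, x, bits):
    """KL divergence of the output when only layer i is quantized to `bits`."""
    p_ref = softmax(forward(weights, x))
    scores = []
    for i in range(len(weights)):
        perturbed = list(weights)
        perturbed[i] = fake_quantize(weights[i], bits)
        scores.append(kl_divergence(p_ref, softmax(forward(perturbed, x))))
    return scores

rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 16)), rng.normal(size=(16, 4))]
x = rng.normal(size=(32, 8))  # random calibration batch
scores = layer_importance(weights, x, bits=4)
print(scores)
```

A larger score suggests the output is more sensitive to quantizing that layer, which is the kind of signal a mixed-precision scheme could use to assign it more bits.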
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · CCD and CMOS Imaging Sensors
Methods: Segment Anything Model
