SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
Jiaji Zhang, Ruichao Sun, Hailiang Zhao, Jiaju Wu, Peng Chen, Hao Li, Yuying Liu, Kingsum Chow, Gang Xiong, Shuiguang Deng

TL;DR
SegQuant is a versatile, semantics-aware quantization framework that significantly reduces the computational demands of diffusion models, facilitating deployment in resource-limited environments without retraining.
Contribution
SegQuant introduces a unified, adaptive quantization approach that combines structure-aware (segment-aware, graph-based) and dual-scale techniques, improving cross-model applicability while preserving output quality.
Findings
Achieves high accuracy with low-bit quantization across various diffusion models.
Maintains visual fidelity and model performance post-quantization.
Compatible with mainstream deployment tools and architectures.
Abstract
Diffusion models have demonstrated exceptional generative capabilities but are computationally intensive, posing significant challenges for deployment in resource-constrained or latency-sensitive environments. Quantization offers an effective means to reduce model size and computational cost, with post-training quantization (PTQ) being particularly appealing due to its compatibility with pre-trained models without requiring retraining or training data. However, existing PTQ methods for diffusion models often rely on architecture-specific heuristics that limit their generalizability and hinder integration with industrial deployment pipelines. To address these limitations, we propose SegQuant, a unified quantization framework that adaptively combines complementary techniques to enhance cross-model versatility. SegQuant consists of a segment-aware, graph-based quantization strategy…
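For readers new to the post-training quantization the abstract refers to, the sketch below shows the generic idea only: symmetric, per-tensor 8-bit quantization of a small weight list, using a single scale and no retraining. It is not SegQuant's segment-aware or dual-scale method. The function names and sample weights are hypothetical, chosen for illustration.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers using one shared scale (symmetric, per-tensor)."""
    qmax = 2 ** (num_bits - 1) - 1               # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one layer of a pre-trained model.
weights = [0.50, -1.27, 0.03, 0.90]
q_weights, scale = quantize_symmetric(weights)
restored = dequantize(q_weights, scale)

# The rounding error of each value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q_weights, scale, max_error)
```

Because the scale is computed directly from the existing weights, no training data or gradient updates are involved, which is the property that makes PTQ attractive for pre-trained models.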
Taxonomy
Topics: Neural Networks and Applications · Advanced Mathematical Modeling in Engineering
