Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales
Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram, Dibakar Gope

TL;DR
This paper introduces a data-free, group-wise quantization method for Winograd convolution in diffusion models. It reports near-lossless text-to-image quality and improved ImageNet classification accuracy without requiring domain-specific training data.
Contribution
The paper proposes finetuning only the scale parameters of the Winograd transformation matrices for fully quantized convolution, which avoids any dependence on training data and improves the accuracy of the quantized model.
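To illustrate why finer-grained (group-wise) scales can matter when values span a large range, here is a minimal sketch in plain Python. It is not the paper's implementation: the helper names, the group size of 4, the toy values, and the absolute-maximum rule for choosing each scale are all assumptions made for illustration. In the paper's setting the scales would be learned rather than fixed this way.

```python
# Illustrative sketch only: compares one scale per tensor with one scale per group.
# All names and values below are hypothetical, not taken from the paper.

def absmax_scale(values, bits=8):
    """Pick a symmetric scale so the largest magnitude maps to the top integer level."""
    qmax = 2 ** (bits - 1) - 1
    largest = max(abs(v) for v in values)
    return largest / qmax if largest > 0 else 1.0

def quantize_dequantize(values, scale, bits=8):
    """Round to signed integers with the given scale, clamp, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = max(-qmax - 1, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out

def per_tensor(values, bits=8):
    """Coarse-grained: a single scale shared by every value."""
    return quantize_dequantize(values, absmax_scale(values, bits), bits)

def group_wise(values, group_size, bits=8):
    """Finer-grained: each consecutive group of values gets its own scale."""
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        out.extend(quantize_dequantize(group, absmax_scale(group, bits), bits))
    return out

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Toy data: four small values followed by four large ones.
values = [0.01, -0.02, 0.03, 0.015, 50.0, -40.0, 30.0, 20.0]
err_tensor = mse(values, per_tensor(values))
err_group = mse(values, group_wise(values, group_size=4))
print("per-tensor MSE:", err_tensor)
print("group-wise MSE:", err_group)
```

With a single shared scale, the large values dictate the step size and the small values all round to zero. Giving each group its own scale keeps the small values representable, which is the intuition behind group-wise quantization.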
Findings
Achieves near-lossless quality in text-to-image generation with 8-bit quantization.
Outperforms state-of-the-art Winograd PTQ in ImageNet classification accuracy.
Enables efficient, high-quality diffusion model inference without additional training data.
Abstract
Despite the revolutionary breakthroughs of large-scale text-to-image diffusion models for complex vision and downstream tasks, their extremely high computational and storage costs limit their usability. Quantization of diffusion models has been explored in recent works to reduce compute costs and memory bandwidth usage. To further improve inference time, fast convolution algorithms such as Winograd can be used for convolution layers, which account for a significant portion of computations in diffusion models. However, the significant quality loss of fully quantized Winograd using existing coarser-grained post-training quantization methods, combined with the complexity and cost of finetuning the Winograd transformation matrices for such large models to recover quality, makes them unsuitable for large-scale foundation models. Motivated by the presence of a large range of values in them,…
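For readers unfamiliar with the fast convolution algorithm the abstract refers to, the following is a minimal sketch of the standard one-dimensional Winograd F(2,3) transform in plain Python. It is a textbook form, not code from the paper; the function names and input values are hypothetical. It computes two outputs of a 3-tap filter with 4 multiplications instead of the 6 used by the direct method.

```python
# Illustrative sketch of 1D Winograd F(2,3): 2 outputs, 3-tap filter, 4-element input tile.
# Function names and sample values are hypothetical.

def direct_conv(d, g):
    """Direct sliding-window correlation: 6 multiplications for 2 outputs."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]

def winograd_f23(d, g):
    """Winograd F(2,3): transform filter and input, multiply element-wise, transform back."""
    # Filter transform (can be precomputed once per filter).
    u = [
        g[0],
        (g[0] + g[1] + g[2]) / 2,
        (g[0] - g[1] + g[2]) / 2,
        g[2],
    ]
    # Input transform (additions and subtractions only).
    v = [
        d[0] - d[2],
        d[1] + d[2],
        d[2] - d[1],
        d[1] - d[3],
    ]
    # Element-wise product in the Winograd domain: the only 4 multiplications.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform (additions and subtractions only).
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]

d = [1.0, 2.0, 3.0, 4.0]   # input tile
g = [0.5, 1.0, -1.0]       # filter taps
y_direct = direct_conv(d, g)
y_winograd = winograd_f23(d, g)
print(y_direct, y_winograd)
```

In a fully quantized pipeline, the transformed filter, the transformed input, and their product would each be quantized, so the value ranges in the Winograd domain determine how much precision is lost.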
Taxonomy
Topics: Advanced Vision and Imaging · Image and Signal Denoising Methods · Sparse and Compressive Sensing Techniques
Methods: Diffusion · Convolution · Contrastive Language-Image Pre-training
