Diffusion Sampling Correction via Approximately 10 Parameters
Guangyi Wang, Wei Peng, Lijiang Li, Wenyu Chen, Yuren Cai, Songzhi Su

TL;DR
This paper introduces PCA-based Adaptive Search (PAS), a plug-in correction for existing diffusion solvers that improves sampling efficiency while storing only about 10 learned parameters and adding negligible cost.
Contribution
PAS is a novel PCA-based adaptive search strategy that optimizes diffusion sampling with minimal parameters and computational overhead.
Findings
PAS reduces the FID of DDIM on CIFAR10 from 15.69 to 4.37 at 10 function evaluations (NFE = 10).
Only 12 parameters are needed for effective correction.
Training is completed in under a minute on a single GPU.
Abstract
While powerful for generation, Diffusion Probabilistic Models (DPMs) face slow sampling challenges, for which various distillation-based methods have been proposed. However, they typically require significant additional training costs and model parameter storage, limiting their practicality. In this work, we propose PCA-based Adaptive Search (PAS), which optimizes existing solvers for DPMs with minimal additional costs. Specifically, we first employ PCA to obtain a few basis vectors to span the high-dimensional sampling space, which enables us to learn just a set of coordinates to correct the sampling direction; furthermore, based on the observation that the cumulative truncation error exhibits an "S"-shape, we design an adaptive search strategy that further enhances the sampling efficiency and reduces the number of stored parameters to approximately 10. Extensive experiments…
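The PCA step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`pca_basis`, `fit_coordinates`, `corrected_direction`), the random stand-in data, and the closed-form least-squares fit are all assumptions made for illustration. It only shows the core idea that a high-dimensional direction can be corrected by storing a handful of coordinates in a small basis; the adaptive search over time steps is not covered.

```python
import numpy as np


def pca_basis(directions, k):
    """Return k orthonormal basis vectors (as rows) for the given directions.

    directions: array of shape (m, d), one flattened direction per row.
    Uses uncentered PCA via SVD (an assumption of this sketch).
    """
    _, _, vt = np.linalg.svd(directions, full_matrices=False)
    return vt[:k]


def fit_coordinates(basis, target):
    """Least-squares coordinates of target in the orthonormal basis."""
    # For orthonormal rows, projection gives the least-squares solution.
    return basis @ target


def corrected_direction(basis, coords):
    """Rebuild a d-dimensional direction from k stored coordinates."""
    return coords @ basis


# Hypothetical stand-in data: 4 past solver directions in a 1000-dim space.
rng = np.random.default_rng(0)
d, m, k = 1000, 4, 4
history = rng.normal(size=(m, d))

# Hypothetical reference direction (e.g. from a many-step solver),
# constructed here so that it lies in the span of the history.
reference = np.array([0.9, 0.3, -0.2, 0.1]) @ history
current = history[0]  # uncorrected direction of the fast solver

basis = pca_basis(history, k)
coords = fit_coordinates(basis, reference)  # only k numbers are stored
corrected = corrected_direction(basis, coords)

err_before = np.linalg.norm(current - reference)
err_after = np.linalg.norm(corrected - reference)
print(f"stored parameters: {coords.size}")
print(f"error before: {err_before:.3f}, after: {err_after:.2e}")
```

In this toy setup the reference lies exactly in the span of the stored directions, so the correction is essentially exact. In practice the fit would only be approximate, and the coordinates would be learned against reference trajectories rather than computed in closed form.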
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques
Methods: Principal Components Analysis · Sparse Evolutionary Training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
