Cross-Resolution Diffusion Models via Network Pruning
Jiaxuan Ren, Junhan Zhu, Huan Wang

TL;DR
CR-Diff is a pruning-based method that enhances cross-resolution image synthesis in diffusion models by improving semantic consistency and stability across various resolutions without sacrificing default resolution performance.
Contribution
It introduces a two-stage approach, block-wise pruning followed by pruned output amplification, that improves cross-resolution performance in diffusion models by removing weights that work well at the default resolution but become adverse at other spatial scales.
Findings
CR-Diff improves perceptual fidelity across unseen resolutions.
It maintains performance at default resolutions.
It supports prompt-specific refinement for on-demand quality enhancement.
Abstract
Diffusion models have demonstrated impressive image synthesis performance, yet many UNet-based models are trained at fixed resolutions, and their quality tends to degrade when generating images at out-of-training resolutions. We trace this issue to resolution-dependent parameter behaviors, where weights that function well at the default resolution can become adverse when spatial scales shift, weakening semantic alignment and causing structural instability in the UNet architecture. Based on this analysis, this paper introduces CR-Diff, a novel method that improves cross-resolution visual consistency by pruning some parameters of the diffusion model. Specifically, CR-Diff has two stages. It first performs block-wise pruning to selectively eliminate adverse weights. It then applies pruned output amplification to further purify the pruned predictions. Empirically, extensive…
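The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how adverse weights are identified or how amplification is computed, so the code below assumes magnitude-based pruning with a per-block sparsity level and a guidance-style extrapolation from the dense prediction toward the pruned one. All function names, block names, and parameters are hypothetical.

```python
from typing import Dict, List

Weights = List[float]


def prune_block(weights: Weights, sparsity: float) -> Weights:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    Magnitude is a stand-in criterion (assumption); the source does not say
    how adverse weights are selected.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending |w|; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_blockwise(
    blocks: Dict[str, Weights], sparsity: Dict[str, float]
) -> Dict[str, Weights]:
    """Apply a per-block sparsity level; blocks without an entry stay dense."""
    return {
        name: prune_block(w, sparsity.get(name, 0.0))
        for name, w in blocks.items()
    }


def amplify(dense_out: Weights, pruned_out: Weights, scale: float) -> Weights:
    """Extrapolate from the dense prediction toward the pruned one.

    scale = 1.0 returns the pruned prediction unchanged; scale > 1.0
    amplifies the difference (hypothetical rule, assumed for illustration).
    """
    if len(dense_out) != len(pruned_out):
        raise ValueError("predictions must have the same length")
    return [d + scale * (p - d) for d, p in zip(dense_out, pruned_out)]
```

In this sketch, calling prune_blockwise with a sparsity entry for only one block leaves every other block untouched, which mirrors the idea of pruning selectively per block rather than uniformly across the network. The scale passed to amplify is a free parameter here and would need tuning in any real setting.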
