NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Dar-Yen Chen; Hmrishav Bandyopadhyay; Kai Zou; Yi-Zhe Song

arXiv:2412.02030·cs.CV·December 9, 2024

TL;DR

NitroFusion introduces a dynamic adversarial training framework with specialized discriminator pools and multi-scale assessment to achieve high-fidelity single-step diffusion, enabling flexible quality-speed trade-offs.

Contribution

The paper presents a novel dynamic adversarial approach with specialized discriminators and flexible deployment for improved single-step diffusion quality.

Findings

01 Outperforms existing single-step methods in quality metrics

02 Preserves fine details and global consistency effectively

03 Supports dynamic quality-speed trade-offs with 1-4 denoising steps

Abstract

We introduce NitroFusion, a fundamentally different approach to single-step diffusion that achieves high-quality generation through a dynamic adversarial framework. While one-step methods offer dramatic speed advantages, they typically suffer from quality degradation compared to their multi-step counterparts. Just as a panel of art critics provides comprehensive feedback by specializing in different aspects like composition, color, and technique, our approach maintains a large pool of specialized discriminator heads that collectively guide the generation process. Each discriminator group develops expertise in specific quality aspects at different noise levels, providing diverse feedback that enables high-fidelity one-step generation. Our framework combines: (i) a dynamic discriminator pool with specialized discriminator groups to improve generation quality, (ii) strategic refresh…
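As a rough illustration of the "pool of specialized discriminator heads" idea in the abstract, the following is a minimal sketch only, not the authors' implementation. It assumes heads are simple linear scorers, that noise levels lie in [0, 1], and that each group owns one equal-width noise-level bucket; the class name `DiscriminatorPool` and all of its parameters are hypothetical. The refresh step is left out because the abstract is truncated before describing it.

```python
import random


class DiscriminatorPool:
    """Toy pool of linear discriminator heads, grouped by noise-level bucket.

    Hypothetical sketch: the real heads would be neural networks.
    """

    def __init__(self, num_groups, heads_per_group, feature_dim, seed=0):
        self.rng = random.Random(seed)
        self.feature_dim = feature_dim
        # groups[g] is a list of heads; each head is a plain weight vector.
        self.groups = [
            [self._new_head() for _ in range(heads_per_group)]
            for _ in range(num_groups)
        ]

    def _new_head(self):
        # Small random weights stand in for a freshly initialized head.
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.feature_dim)]

    def group_for(self, noise_level):
        """Map a noise level in [0, 1] to a group index (assumed bucketing)."""
        idx = int(noise_level * len(self.groups))
        return min(max(idx, 0), len(self.groups) - 1)

    def sample_heads(self, noise_level, k):
        """Pick k distinct heads from the group matching this noise level."""
        group = self.groups[self.group_for(noise_level)]
        return self.rng.sample(group, k)

    def score(self, features, noise_level, k):
        """Average logit of k sampled heads on one feature vector."""
        heads = self.sample_heads(noise_level, k)
        logits = [sum(w * f for w, f in zip(head, features)) for head in heads]
        return sum(logits) / len(logits)


pool = DiscriminatorPool(num_groups=4, heads_per_group=8, feature_dim=16)
features = [0.5] * 16
avg_logit = pool.score(features, noise_level=0.3, k=3)
```

Averaging over a sampled subset, rather than a single fixed head, is one simple way to obtain the "diverse feedback" the abstract mentions; how the paper actually aggregates head outputs is not stated in this excerpt.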

Code & Models

Models

🤗 ChenDY/NitroFusion
model · 70 dl · ♡ 98

Taxonomy

Topics: Spectroscopy Techniques in Biomedical and Chemical Research

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings