DiffNAS: Bootstrapping Diffusion Models by Prompting for Better Architectures

Wenhao Li; Xiu Su; Shan You; Fei Wang; Chen Qian; Chang Xu

arXiv:2310.04750 · cs.AI · October 11, 2023

TL;DR

This paper introduces DiffNAS, an architecture search approach for the base model of a diffusion model. It uses GPT-4 to speed up the search and RFID as a proxy to rank candidates quickly, improving search efficiency and reaching 2.82 FID on CIFAR10.

Contribution

DiffNAS is presented as the first method to use GPT-4 as a supernet for diffusion model architecture search, improving both search speed and model performance.

Findings

01. Search efficiency doubled with GPT-4-based search.

02. Achieved 2.82 FID on CIFAR10, outperforming IDDPM.

03. A rapid-convergence training strategy further improved search efficiency.

Abstract

Diffusion models have recently exhibited remarkable performance on synthetic data. After a diffusion path is selected, a base model, such as UNet, operates as a denoising autoencoder, primarily predicting noises that need to be eliminated step by step. Consequently, it is crucial to employ a model that aligns with the expected budgets to facilitate superior synthetic performance. In this paper, we meticulously analyze the diffusion model and engineer a base model search approach, denoted "DiffNAS". Specifically, we leverage GPT-4 as a supernet to expedite the search, supplemented with a search memory to enhance the results. Moreover, we employ RFID as a proxy to promptly rank the experimental outcomes produced by GPT-4. We also adopt a rapid-convergence training strategy to boost search efficiency. Rigorous experimentation corroborates that our algorithm can augment the search…
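
The search loop described in the abstract (a proposer, a search memory, and a cheap proxy for ranking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `propose` is a hypothetical stand-in for prompting GPT-4 and simply mutates the best remembered architecture, `proxy_score` is a toy function standing in for a fast ranking metric such as RFID, and the `channels` and `depth` fields are invented.

```python
import random
from typing import Dict, List, Tuple

Arch = Dict[str, int]  # hypothetical architecture description


def propose(memory: List[Tuple[Arch, float]], rng: random.Random, n: int = 4) -> List[Arch]:
    """Stand-in for the GPT-4 proposer: mutate the best architecture in memory."""
    if memory:
        best, _ = min(memory, key=lambda item: item[1])
    else:
        best = {"channels": 64, "depth": 2}  # arbitrary starting point
    candidates = []
    for _ in range(n):
        candidates.append({
            "channels": max(16, best["channels"] + rng.choice([-16, 0, 16])),
            "depth": max(1, best["depth"] + rng.choice([-1, 0, 1])),
        })
    return candidates


def proxy_score(arch: Arch) -> float:
    """Toy proxy (lower is better); a real search would use a cheap metric here."""
    return abs(arch["channels"] - 128) / 16 + abs(arch["depth"] - 4)


def search(rounds: int = 10, seed: int = 0, memory_size: int = 8) -> Tuple[Arch, float]:
    """Propose candidates, rank them with the proxy, and keep the best in memory."""
    rng = random.Random(seed)
    memory: List[Tuple[Arch, float]] = []
    for _ in range(rounds):
        for arch in propose(memory, rng):
            memory.append((arch, proxy_score(arch)))
        memory.sort(key=lambda item: item[1])
        del memory[memory_size:]  # the search memory holds only the top entries
    return memory[0]


if __name__ == "__main__":
    best_arch, best_score = search()
    print(best_arch, best_score)
```

Because the memory always retains its best entry, the best proxy score never gets worse from one round to the next. In a real setting the proxy would only approximate final quality, so the top candidates would still need full training and evaluation.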

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Model Reduction and Neural Networks

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Balanced Selection · Dense Connections · Label Smoothing · Adam · Diffusion