Training on Thin Air: Improve Image Classification with Generated Data

Yongchao Zhou; Hshmat Sahak; Jimmy Ba

arXiv:2305.15316·cs.CV·May 25, 2023·6 cites

TL;DR

This paper introduces Diffusion Inversion, a method that uses the pre-trained Stable Diffusion model to generate diverse, high-quality training images for image classification, reporting a 2-3x improvement in sample complexity and a 6.5x reduction in sampling time.

Contribution

We propose Diffusion Inversion, a technique that leverages a pre-trained generative model to create effective training data, outperforming generic prompt-based steering and improving the performance of various neural architectures.

Findings

01. 2-3x increase in sample efficiency

02. 6.5x reduction in sampling time

03. Consistent performance improvements across datasets

Abstract

Acquiring high-quality data for training discriminative models is a crucial yet challenging aspect of building effective predictive systems. In this paper, we present Diffusion Inversion, a simple yet effective method that leverages the pre-trained generative model, Stable Diffusion, to generate diverse, high-quality training data for image classification. Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion, and generates diverse novel training images by conditioning the generative model on noisy versions of these vectors. We identify three key components that allow our generated images to successfully supplant the original dataset, leading to a 2-3x enhancement in sample complexity and a 6.5x decrease in sampling time. Moreover, our approach consistently outperforms generic prompt-based steering…
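
The conditioning step described in the abstract (adding noise to the inverted vectors before generation) can be illustrated with a minimal sketch. It assumes the per-image vectors have already been obtained by inversion and that the noise is isotropic Gaussian; the function name `noisy_conditions`, the `noise_scale` parameter, and the array shapes are hypothetical and are not taken from the paper or its repository. The image generation step with Stable Diffusion is omitted.

```python
import numpy as np


def noisy_conditions(embeddings, num_variants=4, noise_scale=0.1, seed=0):
    """Return Gaussian-perturbed copies of each inverted embedding.

    embeddings: array of shape (n_images, dim), one vector per training image
                (assumed to come from a prior inversion step, not shown here).
    Returns an array of shape (n_images, num_variants, dim).
    """
    rng = np.random.default_rng(seed)
    embeddings = np.asarray(embeddings, dtype=np.float64)
    n_images, dim = embeddings.shape
    # Independent noise for every (image, variant) pair.
    noise = rng.normal(size=(n_images, num_variants, dim))
    # Broadcast each embedding across its variants and add scaled noise.
    return embeddings[:, None, :] + noise_scale * noise


# Hypothetical usage: 3 inverted vectors of dimension 8, 5 variants each.
inverted = np.ones((3, 8))
variants = noisy_conditions(inverted, num_variants=5, noise_scale=0.1, seed=1)
print(variants.shape)
```

In this sketch each perturbed vector would serve as the condition for one generated image, so a larger `noise_scale` moves the conditions further from the original data.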

Code & Models

Repositories

yongchao97/diffusion_inversion (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computational Physics and Python Applications · AI in cancer detection

Methods: Diffusion