Is synthetic data from generative models ready for image recognition?

Ruifei He; Shuyang Sun; Xin Yu; Chuhui Xue; Wenqing Zhang; Philip Torr; Song Bai; Xiaojuan Qi

arXiv:2210.07574 · cs.CV · February 16, 2023 · 56 cites

TL;DR

This paper evaluates the effectiveness of synthetic images generated by state-of-the-art text-to-image models for image recognition, focusing on data augmentation in scarce data scenarios and pre-training for transfer learning.

Contribution

It provides an extensive analysis of the utility of synthetic data for recognition tasks and proposes strategies for applying such data more effectively in classification and pre-training.

Findings

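01 Synthetic data can improve classification in data-scarce (zero-shot and few-shot) settings.

02 Synthetic images from existing generative models still have shortcomings for recognition tasks.

03 The paper proposes strategies for applying synthetic data more effectively to recognition tasks.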

Abstract

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are for recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks, and focus on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e. zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the powerfulness and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.

Code & Models

Repositories

CVMI-Lab/SyntheticData · PyTorch · Official

Videos

Is synthetic data from generative models ready for image recognition? · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Computational Physics and Python Applications