Scaling Laws of Synthetic Images for Model Training ... for Now
Lijie Fan, Kaifeng Chen, Dilip Krishnan, Dina Katabi, Phillip Isola, Yonglong Tian

TL;DR
This paper investigates how synthetic images generated by state-of-the-art text-to-image models scale for training vision systems, revealing their limitations and potential benefits in specific scenarios such as limited data or out-of-distribution tasks.
Contribution
It provides a detailed analysis of the scaling laws of synthetic images for training classifiers and CLIP models, highlighting factors affecting performance and identifying scenarios where synthetic data is most beneficial.
Findings
Synthetic images follow a scaling trend similar to real images in CLIP training.
Synthetic images scale significantly worse than real images when training supervised image classifiers.
Limitations of current text-to-image models hinder their effectiveness for classifier training.
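To make "scaling trend" concrete, here is a minimal sketch of how such a trend is commonly quantified: fit a power law, loss ≈ a · N^(-alpha), by linear regression in log-log space, where N is the number of training images. The power-law form, the function name `fit_power_law`, and every number below are assumptions for illustration only; none of them are taken from the paper.

```python
import numpy as np


def fit_power_law(num_images, losses):
    """Fit loss ~= a * N**(-alpha) by least squares in log-log space.

    Returns (a, alpha). A larger alpha means loss drops faster as data grows.
    """
    log_n = np.log(np.asarray(num_images, dtype=float))
    log_loss = np.log(np.asarray(losses, dtype=float))
    # In log-log space the power law is a straight line:
    # log(loss) = log(a) - alpha * log(N)
    slope, intercept = np.polyfit(log_n, log_loss, deg=1)
    return float(np.exp(intercept)), float(-slope)


# Made-up, noise-free points generated from a known power law (not paper data).
sizes = np.array([1e6, 2e6, 4e6, 8e6, 16e6])
toy_losses = 50.0 * sizes ** -0.25

a, alpha = fit_power_law(sizes, toy_losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

Fitting one curve for real images and one for synthetic images, then comparing the two `alpha` values, is one simple way to express "scales similarly" versus "scales worse" as a number.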
Abstract
Recent significant advances in text-to-image models unlock the possibility of training vision systems using synthetic images, potentially overcoming the difficulty of collecting curated data at scale. It is unclear, however, how these models behave at scale, as more synthetic data is added to the training set. In this paper, we study the scaling laws of synthetic images generated by state-of-the-art text-to-image models, for the training of supervised models: image classifiers with label supervision, and CLIP with language supervision. We identify several factors, including text prompts, classifier-free guidance scale, and types of text-to-image models, that significantly affect scaling behavior. After tuning these factors, we observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training, while they significantly…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Contrastive Language-Image Pre-training
