How Many Images Does It Take? Estimating Imitation Thresholds in Text-to-Image Models
Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar

TL;DR
This paper introduces a method for estimating how many training images of a concept a text-to-image model needs before it can imitate that concept, a figure that could inform copyright and privacy compliance.
Contribution
The paper proposes an efficient approach that estimates imitation thresholds in text-to-image models without the cost of training those models from scratch.
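This summary does not describe the estimation procedure itself. As a rough illustration only, the sketch below shows one generic way a threshold could be read off existing models: sort concepts by how often they appear in the training data, then find the count at which an imitation score shifts most sharply (a single change-point fit). The function name `estimate_threshold`, the scoring scale, and all data values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def estimate_threshold(counts: Sequence[int], scores: Sequence[float]) -> int:
    """Return the training-image count where imitation scores shift most sharply.

    Assumption: a two-segment piecewise-constant fit stands in for whatever
    change-point method the paper actually uses.
    """
    pairs = sorted(zip(counts, scores))  # order concepts by training frequency
    ys = [score for _, score in pairs]
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two concepts")

    def sse(segment: Sequence[float]) -> float:
        # Sum of squared errors around the segment mean.
        mean = sum(segment) / len(segment)
        return sum((y - mean) ** 2 for y in segment)

    # Pick the split that best separates "does not imitate" from "imitates".
    best_split = min(range(1, n), key=lambda k: sse(ys[:k]) + sse(ys[k:]))
    return pairs[best_split][0]


# Synthetic, made-up data: per-concept image counts and imitation scores.
counts = [50, 120, 200, 310, 450, 600, 900, 1500]
scores = [0.05, 0.07, 0.06, 0.08, 0.41, 0.44, 0.47, 0.45]
threshold = estimate_threshold(counts, scores)
```

With these invented values the scores jump between the concepts seen 310 and 450 times, so the sketch reports the first count in the higher-scoring segment. No model is retrained at any point; the only inputs are per-concept frequencies and scores measured on an already trained model.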
Findings
Imitation thresholds range from 200 to 700 images.
Thresholds vary by domain and by model.
The estimates provide an empirical basis for copyright considerations.
Abstract
Text-to-image models are trained using large datasets of image-text pairs collected from the internet. These datasets often include copyrighted and private images. Training models on such datasets enables them to generate images that might violate copyright laws and individual privacy. This phenomenon is termed imitation -- generation of images with content that has recognizable similarity to its training images. In this work we estimate the point at which a model was trained on enough instances of a concept to be able to imitate it -- the imitation threshold. We posit this question as a new problem and propose an efficient approach that estimates the imitation threshold without incurring the colossal cost of training these models from scratch. We experiment with two domains -- human faces and art styles, and evaluate four text-to-image models that were trained on three pretraining…
Taxonomy
Topics: Neurology and Historical Studies · Empathy and Medical Education
