How Many Images Does It Take? Estimating Imitation Thresholds in Text-to-Image Models

Sahil Verma; Royi Rassin; Arnav Das; Gantavya Bhatt; Preethi Seshadri; Chirag Shah; Jeff Bilmes; Hannaneh Hajishirzi; Yanai Elazar

arXiv:2410.15002 · cs.CV · January 7, 2026

TL;DR

This paper introduces a method for estimating how many training images of a concept a text-to-image model needs before it can imitate that concept, which can help model developers comply with copyright and privacy laws.

Contribution

It proposes an efficient approach that estimates imitation thresholds in text-to-image models without the cost of training those models from scratch.

Findings

01

Imitation thresholds range from 200 to 600 images.

02

The threshold varies by domain and by model.

03

The estimated thresholds provide an empirical basis for copyright and privacy considerations.

Abstract

Text-to-image models are trained using large datasets of image-text pairs collected from the internet. These datasets often include copyrighted and private images. Training models on such datasets enables them to generate images that might violate copyright laws and individual privacy. This phenomenon is termed imitation -- generation of images with content that has recognizable similarity to its training images. In this work we estimate the point at which a model was trained on enough instances of a concept to be able to imitate it -- the imitation threshold. We posit this question as a new problem and propose an efficient approach that estimates the imitation threshold without incurring the colossal cost of training these models from scratch. We experiment with two domains -- human faces and art styles, and evaluate four text-to-image models that were trained on three pretraining…

Code & Models

Repositories

vsahil/mimetic-2 (PyTorch, official)

Taxonomy

Topics: Neurology and Historical Studies · Empathy and Medical Education