Detecting AI-Generated Images via CLIP
A.G. Moskowitz, T. Gaona, J. Peterson

TL;DR
This paper demonstrates that fine-tuned CLIP models can effectively detect AI-generated images and identify their generation methods, offering a resource-efficient alternative to specialized detection models.
Contribution
The study shows that pre-trained CLIP, when fine-tuned, can match or outperform specialized models in detecting AI-generated images without architecture modifications.
Findings
Fine-tuned CLIP detects AI-generated images effectively.
CLIP-based detection requires fewer GPU resources than specialized detection models.
The method also identifies which AI generation method produced an image.
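The findings describe two decisions: whether an image is AI-generated and, if so, which generator produced it. The sketch below shows one way to read both decisions off a single set of class scores. It assumes one "real" class plus one class per generator, which this summary does not confirm as the paper's design. The label names, the classify helper, and the 0.5 threshold are all hypothetical.

```python
import math

# Hypothetical label set: one "real" class plus one class per generator.
# The generator names are placeholders, not the ones used in the paper.
LABELS = ["real", "generator_a", "generator_b", "generator_c"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels=LABELS, threshold=0.5):
    """Return (is_ai_generated, predicted_source, p_ai) for one image.

    `logits` stands in for the per-class scores a fine-tuned model would
    emit; the threshold of 0.5 is an assumed default.
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    # Probability of "AI-generated" is everything not assigned to "real".
    p_ai = 1.0 - probs[labels.index("real")]
    if p_ai < threshold:
        return False, "real", p_ai
    # Among the generator classes only, pick the most probable one.
    candidates = [(p, name) for p, name in zip(probs, labels) if name != "real"]
    _, source = max(candidates)
    return True, source, p_ai


if __name__ == "__main__":
    print(classify([0.2, 3.1, 0.4, -1.0]))
```

Summing the generator probabilities before thresholding lets the real-versus-generated decision stay confident even when the model is unsure which generator made the image.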
Abstract
As AI-generated image (AIGI) methods become more powerful and accessible, it has become a critical task to determine if an image is real or AI-generated. Because AIGI lack the signatures of photographs and have their own unique patterns, new models are needed to determine if an image is AI-generated. In this paper, we investigate the ability of the Contrastive Language-Image Pre-training (CLIP) architecture, pre-trained on massive internet-scale data sets, to perform this differentiation. We fine-tune CLIP on real images and AIGI from several generative models, enabling CLIP to determine if an image is AI-generated and, if so, determine what generation method was used to create it. We show that the fine-tuned CLIP architecture is able to differentiate AIGI as well as or better than models whose architecture is specifically designed to detect AIGI. Our method will significantly increase…
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · AI in cancer detection
Methods: Contrastive Language-Image Pre-training
