Detecting AI-Generated Images via Distributional Deviations from Real Images
Yakun Niu, Yingjian Chen, Lei Zhang

TL;DR
This paper introduces a novel fine-tuning strategy for CLIP to improve the detection of AI-generated images by focusing on distributional deviations, achieving high accuracy and better generalization.
Contribution
It proposes a Texture-Aware Masking fine-tuning method that enhances CLIP's ability to distinguish real from AI-generated images by emphasizing distributional deviations.
Findings
Achieves up to 98.2% accuracy on GenImage dataset
Outperforms existing methods with minimal training data
Enhances generalization to unseen generative models
Abstract
The rapid advancement of generative models has significantly enhanced the quality of AI-generated images, raising concerns about misinformation and the erosion of public trust. Detecting AI-generated images has thus become a critical challenge, particularly in terms of generalizing to unseen generative models. Existing methods using frozen pre-trained CLIP models show promise in generalization but treat the image encoder as a basic feature extractor, failing to fully exploit its potential. In this paper, we perform an in-depth analysis of the frozen CLIP image encoder (CLIP-ViT), revealing that it effectively clusters real images in a high-level, abstract feature space. However, it does not truly possess the ability to distinguish between real and AI-generated images. Based on this analysis, we propose a Masking-based Pre-trained model Fine-Tuning (MPFT) strategy, which introduces a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis
