Detecting AI-Generated Images via Distributional Deviations from Real Images

Yakun Niu; Yingjian Chen; Lei Zhang

arXiv:2601.03586 · cs.CV · January 8, 2026

Open Access

TL;DR

This paper introduces a fine-tuning strategy for CLIP that improves the detection of AI-generated images by focusing on their distributional deviations from real images, reaching up to 98.2% accuracy on GenImage and generalizing better to unseen generative models.

Contribution

It proposes a Texture-Aware Masking fine-tuning method that enhances CLIP's ability to distinguish real from AI-generated images by emphasizing how generated images deviate from the distribution of real images.
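To make the idea of texture-aware masking concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. This summary does not say how texture is measured or which regions are masked, so the sketch assumes per-patch pixel variance as a texture proxy and assumes that the most textured patches are the ones masked. The function name `texture_aware_mask` and the parameters `patch`, `mask_ratio`, and `fill` are hypothetical.

```python
from statistics import pvariance


def texture_aware_mask(image, patch=2, mask_ratio=0.5, fill=0.0):
    """Mask the highest-variance patches of a 2D grayscale image.

    Hypothetical helper: using variance as a texture proxy and masking the
    most textured patches are assumptions, not details taken from the paper.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")

    # Score every non-overlapping patch by the variance of its pixels.
    scores = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            pixels = [image[r][c]
                      for r in range(top, top + patch)
                      for c in range(left, left + patch)]
            scores.append((pvariance(pixels), top, left))

    # Select the most textured patches; ties keep raster order (stable sort).
    n_masked = int(len(scores) * mask_ratio)
    chosen = sorted(scores, key=lambda s: s[0], reverse=True)[:n_masked]

    masked = [list(row) for row in image]  # copy so the input is untouched
    for _, top, left in chosen:
        for r in range(top, top + patch):
            for c in range(left, left + patch):
                masked[r][c] = fill
    return masked, [(top, left) for _, top, left in chosen]


# Toy 4x4 image: flat patches on the left, checkerboard-like patches on the right.
image = [
    [0.5, 0.5, 0.0, 1.0],
    [0.5, 0.5, 1.0, 0.0],
    [0.2, 0.2, 0.1, 0.9],
    [0.2, 0.2, 0.9, 0.1],
]
masked, positions = texture_aware_mask(image, patch=2, mask_ratio=0.5)
```

In this toy input the two right-hand patches have the highest variance, so they are filled with zeros while the flat left-hand patches are kept. In a fine-tuning pipeline, a mask like this would presumably be applied to training images before they reach the image encoder, but that wiring is outside what this summary describes.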

Findings

01 Achieves up to 98.2% accuracy on GenImage dataset

02 Outperforms existing methods with minimal training data

03 Enhances generalization to unseen generative models

Abstract

The rapid advancement of generative models has significantly enhanced the quality of AI-generated images, raising concerns about misinformation and the erosion of public trust. Detecting AI-generated images has thus become a critical challenge, particularly in terms of generalizing to unseen generative models. Existing methods using frozen pre-trained CLIP models show promise in generalization but treat the image encoder as a basic feature extractor, failing to fully exploit its potential. In this paper, we perform an in-depth analysis of the frozen CLIP image encoder (CLIP-ViT), revealing that it effectively clusters real images in a high-level, abstract feature space. However, it does not truly possess the ability to distinguish between real and AI-generated images. Based on this analysis, we propose a Masking-based Pre-trained model Fine-Tuning (MPFT) strategy, which introduces a…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis