How Noise Benefits AI-generated Image Detection
Ziqiang Li, Jiazhen Yan, Fan Wang, Kai Zeng, Zhangjie Fu

TL;DR
This paper introduces PiN-CLIP, a novel noise-based training method that enhances the robustness and generalization of AI-generated image detection by suppressing shortcuts and amplifying stable forensic cues.
Contribution
It proposes a controllable positive-incentive noise technique that improves detection accuracy across diverse generative models, advancing out-of-distribution generalization.
Findings
Achieved a 5.4% improvement in average accuracy over existing methods.
Constructed feature-space noise via cross-attention to suppress shortcut-sensitive directions.
Demonstrated state-of-the-art performance on a dataset with 42 different generative models.
Abstract
The rapid advancement of generative models has made real and synthetic images increasingly indistinguishable. Although extensive efforts have been devoted to detecting AI-generated images, out-of-distribution generalization remains a persistent challenge. We trace this weakness to spurious shortcuts exploited during training, and we observe that small feature-space perturbations can mitigate shortcut dominance. To address this problem in a more controllable manner, we propose Positive-Incentive Noise for CLIP (PiN-CLIP), which jointly trains a noise generator and a detection network under a variational positive-incentive principle. Specifically, we construct positive-incentive noise in the feature space via cross-attention fusion of visual and categorical semantic features. During optimization, the noise is injected into the feature space to fine-tune the visual encoder,…
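The abstract describes noise built by cross-attention between visual and categorical semantic features and then injected into the feature space. The sketch below is one possible reading of that idea, not the authors' implementation: the function names, the projection matrices `w_q`, `w_k`, `w_v`, the `scale` factor, and the additive injection are all assumptions made for illustration, and the random arrays stand in for CLIP image and text embeddings.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_noise(visual, text, w_q, w_k, w_v, scale=0.1):
    """Hypothetical noise generator (assumed design, not from the paper).

    visual: (batch, d) image features, used as queries.
    text:   (classes, d) class-prompt features, used as keys and values.
    Returns the noise (batch, d) and the attention weights (batch, classes).
    """
    q = visual @ w_q                                   # (batch, d)
    k = text @ w_k                                     # (classes, d)
    v = text @ w_v                                     # (classes, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))     # (batch, classes)
    noise = scale * (attn @ v)                         # (batch, d)
    return noise, attn


def inject(visual, noise):
    """Additive injection into the feature space (an assumption)."""
    return visual + noise


# Toy usage: random stand-ins for image and "real"/"fake" text embeddings.
rng = np.random.default_rng(0)
d = 8
visual = rng.normal(size=(4, d))
text = rng.normal(size=(2, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

noise, attn = cross_attention_noise(visual, text, w_q, w_k, w_v)
perturbed = inject(visual, noise)
print(perturbed.shape, attn.shape)
```

In a trained system the projections would be learned jointly with the detector, as the abstract states for the noise generator; here they are fixed random matrices so the sketch runs on its own.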
