Multimodal Conditional Information Bottleneck for Generalizable AI-Generated Image Detection

Haotian Qin; Dongliang Chang; Yueying Gao; Bingyao Yu; Lei Chen; Zhanyu Ma

arXiv:2505.15217 · cs.CV · May 22, 2025

TL;DR

This paper introduces a multimodal conditional information bottleneck framework that enhances the generalization of AI-generated image detection by reducing feature redundancy and leveraging text guidance.

Contribution

It proposes a novel multimodal conditional bottleneck network with dynamic text orthogonalization to improve detection of AI-generated images across diverse models.

Findings

01 Achieves superior generalization on the GenImage dataset

02 Effectively reduces feature redundancy in CLIP-based detection

03 Outperforms existing methods in detecting images from the latest generative models

Abstract

Although existing CLIP-based methods for detecting AI-generated images have achieved promising results, they are still limited by severe feature redundancy, which hinders their generalization ability. To address this issue, incorporating an information bottleneck network into the task presents a straightforward solution. However, relying solely on image-corresponding prompts results in suboptimal performance due to the inherent diversity of prompts. In this paper, we propose a multimodal conditional bottleneck network to reduce feature redundancy while enhancing the discriminative power of features extracted by CLIP, thereby improving the model's generalization ability. We begin with a semantic analysis experiment, where we observe that arbitrary text features exhibit lower cosine similarity with real image features than with fake image features in the CLIP feature space, a phenomenon…
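The semantic observation in the abstract comes down to comparing mean cosine similarities: one text feature against a set of real image features and against a set of fake image features. The following is a minimal sketch of that comparison, assuming the features have already been extracted and are available as plain lists of floats. The function names (`cosine_similarity`, `mean_similarity`, `similarity_gap`) and the toy vectors are hypothetical; they are not taken from the paper or its repository, and the toy values only exercise the arithmetic rather than reproduce any reported result.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_similarity(text_feature, image_features):
    """Average cosine similarity of one text feature to a set of image features."""
    if not image_features:
        raise ValueError("image_features must not be empty")
    sims = [cosine_similarity(text_feature, f) for f in image_features]
    return sum(sims) / len(sims)


def similarity_gap(text_feature, real_features, fake_features):
    """Mean similarity to fake images minus mean similarity to real images.

    A positive value means the text feature sits closer to the fake set,
    which is the direction of the effect described in the abstract.
    """
    return (mean_similarity(text_feature, fake_features)
            - mean_similarity(text_feature, real_features))


# Toy placeholder vectors (assumption: NOT real CLIP features).
text = [1.0, 0.0]
real = [[0.0, 1.0], [0.0, 2.0]]   # orthogonal to the text vector
fake = [[1.0, 0.0], [3.0, 0.0]]   # parallel to the text vector
gap = similarity_gap(text, real, fake)
print(gap)
```

With real data, `text`, `real`, and `fake` would be replaced by embeddings from a CLIP text encoder and image encoder; how the paper samples the "arbitrary" text is not specified in the truncated abstract, so that step is left out here.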

Code & Models

Repositories

ant0ny44/infofd (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Brain Tumor Detection and Classification

Methods: Contrastive Language-Image Pre-training