Improving Interpretability and Robustness for the Detection of AI-Generated Images

Tatiana Gaintseva; Laida Kushnareva; German Magai; Irina Piontkovskaya; Sergey Nikolenko; Martin Benning; Serguei Barannikov; Gregory Slabaugh

arXiv:2406.15035·cs.CV·June 24, 2024

Open Access

TL;DR

This paper analyzes AI-generated image detectors built on frozen CLIP embeddings and improves their robustness in two ways: removing harmful components of the embedding vector and selecting the best-performing attention heads in the image encoder. It also introduces a new dataset aimed at generalization across generative models.

Contribution

The paper introduces two methods that improve the robustness of AI-generated image detectors and provides a new dataset to facilitate future research in AI-generated image detection.

Findings

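01. The proposed methods increase the mean out-of-distribution (OOD) classification score by up to 6% for cross-model transfer.

02. Analysis of detection methods based on frozen CLIP embeddings shows how to interpret them and how images produced by various AI generators differ from real ones.

03. A new dataset supports the evaluation of cross-model generalization.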

Abstract

With the growing abilities of generative models, artificial content detection is becoming an increasingly important and difficult task. However, all popular approaches to this problem suffer from poor generalization across domains and generative models. In this work, we focus on the robustness of AI-generated image (AIGI) detectors. We analyze existing state-of-the-art AIGI detection methods based on frozen CLIP embeddings and show how to interpret them, shedding light on how images produced by various AI generators differ from real ones. Next, we propose two ways to improve robustness: one based on removing harmful components of the embedding vector, and one based on selecting the best-performing attention heads in the image encoder model. Our methods increase the mean out-of-distribution (OOD) classification score by up to 6% for cross-model transfer. We also propose a new dataset for AIGI detection…
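The abstract does not say how harmful embedding components are identified, so the following is only a toy sketch of the general idea, not the authors' procedure. It assumes synthetic vectors standing in for frozen image embeddings, a simple mean-difference linear detector, and a hypothetical selection rule: drop any coordinate whose removal clearly improves accuracy on a validation set from a held-out generator. All function names, the synthetic data, and the `margin` parameter are assumptions made for illustration.

```python
import random

def make_split(rng, n, spurious_shift, dim=6, noise=0.5):
    """Synthetic stand-in for frozen embeddings (label 1 = generated, 0 = real).
    Coordinate 0 carries a generator-independent signal; coordinate 1 carries a
    generator-specific signal whose sign and size are set by spurious_shift."""
    xs, ys = [], []
    for i in range(n):
        y = i % 2
        sign = 1.0 if y == 1 else -1.0
        x = [rng.gauss(0.0, noise) for _ in range(dim)]
        x[0] += sign * 1.0
        x[1] += sign * spurious_shift
        xs.append(x)
        ys.append(y)
    return xs, ys

def fit_mean_difference(xs, ys):
    """Linear detector: w = mean(generated) - mean(real), centered at the midpoint."""
    dim = len(xs[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for x, y in zip(xs, ys):
        counts[y] += 1
        for j in range(dim):
            sums[y][j] += x[j]
    mu_real = [v / counts[0] for v in sums[0]]
    mu_fake = [v / counts[1] for v in sums[1]]
    w = [f - r for f, r in zip(mu_fake, mu_real)]
    mid = [(f + r) / 2 for f, r in zip(mu_fake, mu_real)]
    return w, mid

def accuracy(w, mid, xs, ys, keep):
    """Accuracy of the detector when only the coordinates in `keep` are used."""
    correct = 0
    for x, y in zip(xs, ys):
        score = sum(w[j] * (x[j] - mid[j]) for j in keep)
        correct += int((score > 0) == (y == 1))
    return correct / len(xs)

def select_components(w, mid, xs_val, ys_val, margin=0.05):
    """Hypothetical rule: drop a coordinate if removing it alone raises
    held-out-generator validation accuracy by more than `margin`."""
    dims = list(range(len(w)))
    base = accuracy(w, mid, xs_val, ys_val, dims)
    keep = []
    for j in dims:
        without_j = [k for k in dims if k != j]
        if accuracy(w, mid, xs_val, ys_val, without_j) <= base + margin:
            keep.append(j)
    return keep

rng = random.Random(0)
x_train, y_train = make_split(rng, 400, spurious_shift=2.0)   # training generator
x_val, y_val = make_split(rng, 400, spurious_shift=-2.0)      # held-out generator
x_test, y_test = make_split(rng, 400, spurious_shift=-2.0)    # OOD test generator

w, mid = fit_mean_difference(x_train, y_train)
keep = select_components(w, mid, x_val, y_val)
acc_before = accuracy(w, mid, x_test, y_test, list(range(len(w))))
acc_after = accuracy(w, mid, x_test, y_test, keep)
print("kept coordinates:", keep)
print("OOD accuracy before/after removal:", acc_before, acc_after)
```

In this toy setup the detector leans on coordinate 1, which separates the classes for the training generator but flips sign for the other generators, so removing it restores out-of-distribution accuracy. Real embeddings would need a less artificial notion of "component" (for example, directions rather than raw coordinates), which the abstract leaves unspecified.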

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)

Methods: Softmax · Attention Is All You Need · Contrastive Language-Image Pre-training · Focus