RAIDX: A Retrieval-Augmented Generation and GRPO Reinforcement Learning Framework for Explainable Deepfake Detection

Tianxiao Li, Zhenglin Huang, Haiquan Wen, Yiwei He, Shuchang Lyu, Baoyuan Wu, and Guangliang Cheng

arXiv:2508.04524 · cs.CV · August 7, 2025

TL;DR

RAIDX combines retrieval-augmented generation (RAG) with GRPO reinforcement learning to improve deepfake detection accuracy and produce interpretable explanations, addressing the lack of transparency in existing detectors and their reliance on labor-intensive annotations.

Contribution

Introduces RAIDX, the first unified framework integrating RAG and GRPO for enhanced deepfake detection and explainability without extensive manual annotations.

Findings

01 Achieves state-of-the-art detection accuracy on multiple benchmarks.

02 Provides fine-grained textual explanations and saliency maps.

03 Enhances transparency and interpretability in deepfake detection.

Abstract

The rapid advancement of AI-generation models has enabled the creation of hyperrealistic imagery, posing ethical risks through widespread misinformation. Current deepfake detection methods, categorized as face-specific detectors or general AI-generated detectors, lack transparency because they frame detection as a classification task and do not explain their decisions. While several LLM-based approaches offer explainability, they suffer from coarse-grained analyses and dependence on labor-intensive annotations. This paper introduces RAIDX (Retrieval-Augmented Image Deepfake Detection and Explainability), a novel deepfake detection framework integrating Retrieval-Augmented Generation (RAG) and Group Relative Policy Optimization (GRPO) to enhance detection accuracy and decision explainability. Specifically, RAIDX leverages RAG to incorporate external knowledge for improved detection accuracy and…
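For readers unfamiliar with GRPO, the following is a minimal sketch of its core idea: several responses are sampled for the same input, and each response's reward is normalized against the rest of its group, which avoids a separately trained value model. This illustrates the general technique only, not the RAIDX implementation. The function name, the reward values, and the use of the population standard deviation are assumptions made for this example, and the rewards RAIDX actually uses are not described in the text above.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in its group.

    `rewards` holds one scalar reward per response sampled for the same
    input. The name and the eps default are hypothetical choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    # eps guards against division by zero when all rewards are equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled explanations of one image,
# e.g. 1.0 when the real/fake verdict is correct and 0.0 otherwise.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.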
