EvoGuard: An Extensible Agentic RL-based Framework for Practical and Evolving AI-Generated Image Detection
Chenyang Zhu, Maorong Wang, Jun Liu, Ching-Chun Chang, Isao Echizen

TL;DR
EvoGuard is an agentic framework that intelligently orchestrates multiple detectors for AI-generated image detection, achieving state-of-the-art accuracy while being adaptable, cost-effective, and capable of integrating new tools without retraining.
Contribution
It introduces a novel agentic, tool-coordinating framework for AIGI detection that leverages reinforcement learning and reflection, surpassing traditional methods in accuracy and extensibility.
Findings
Achieves state-of-the-art detection accuracy.
Effectively integrates heterogeneous detectors.
Reduces reliance on costly annotations.
Abstract
The rapid proliferation of AI-Generated Images (AIGIs) has introduced severe risks of misinformation, making AIGI detection a critical yet challenging task. Traditional detection paradigms rely mainly on low-level features. Recent research increasingly leverages the general understanding ability of Multimodal Large Language Models (MLLMs) to achieve better generalization, but these approaches still suffer from limited extensibility and expensive training data annotations. To better address complex and dynamic real-world environments, we propose EvoGuard, a novel agentic framework for AIGI detection. It encapsulates various state-of-the-art (SOTA) off-the-shelf MLLM and non-MLLM detectors as callable tools, and coordinates them through a capability-aware dynamic orchestration mechanism. Empowered by the agent's capacities for autonomous planning and reflection, it intelligently selects…
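The idea of wrapping detectors as callable tools and choosing among them by capability can be illustrated with a minimal sketch. It is not the paper's implementation: EvoGuard is described as using an agent with planning and reflection, whereas this sketch substitutes a fixed ranking rule purely for illustration. Every name here (`DetectorTool`, `select_tools`, `orchestrate`, the `capabilities` scores and the `content_type` labels) is a hypothetical assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; none are taken from the EvoGuard paper.


@dataclass
class DetectorTool:
    name: str
    # Assumed self-reported reliability per content type, in [0, 1].
    capabilities: Dict[str, float]
    # Assumed interface: image bytes in, probability of "AI-generated" out.
    run: Callable[[bytes], float]


def select_tools(tools: List[DetectorTool], content_type: str, k: int = 2) -> List[DetectorTool]:
    """Pick the k tools with the highest capability score for this content type."""
    ranked = sorted(
        tools,
        key=lambda t: t.capabilities.get(content_type, 0.0),
        reverse=True,
    )
    return ranked[:k]


def orchestrate(tools: List[DetectorTool], image: bytes, content_type: str, k: int = 2) -> float:
    """Capability-weighted average of the selected tools' scores.

    A static stand-in for the learned, agent-driven selection the abstract describes.
    """
    chosen = select_tools(tools, content_type, k)
    weights = [t.capabilities.get(content_type, 0.0) for t in chosen]
    total = sum(weights)
    if total == 0:
        return 0.5  # no tool claims competence: return an uninformative score
    scores = [t.run(image) for t in chosen]
    return sum(w * s for w, s in zip(weights, scores)) / total


if __name__ == "__main__":
    registry = [
        DetectorTool("low_level_detector", {"face": 0.9}, lambda img: 0.8),
        DetectorTool("mllm_detector", {"face": 0.1, "scene": 0.7}, lambda img: 0.2),
    ]
    # A new detector joins by appending to the registry; nothing is retrained.
    registry.append(DetectorTool("new_detector", {"scene": 0.9}, lambda img: 0.6))
    print(orchestrate(registry, b"", "scene"))
```

Because the registry is a plain list, adding a detector requires no change to the selection logic, which loosely mirrors the extensibility claim above.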
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
