Explainable AI-Generated Image Detection RewardBench
Michael Yang, Shijian Deng, William T. Doan, Kai Wang, Tianyu Yang, Harsh Singh, Yapeng Tian

TL;DR
This paper introduces XAIGID-RewardBench, a benchmark that evaluates how well multimodal large language models (MLLMs) can judge explanations of whether an image is real or AI-generated, and it reveals a performance gap between the best models and humans.
Contribution
It presents the first benchmark for assessing MLLMs' ability to judge explanations produced for AI-generated image detection, highlighting current limitations and providing a new evaluation framework.
Findings
Best reward model scored 88.76% accuracy
Human agreement reaches 98.30%
Current models trail human judgment by roughly 9.5 percentage points
Abstract
Conventional classification-based AI-generated image detection methods cannot explain why an image is considered real or AI-generated in the way a human expert would, which reduces the trustworthiness and persuasiveness of these detection tools in real-world applications. Leveraging Multimodal Large Language Models (MLLMs) has recently become a popular solution to this issue. To evaluate the quality of the generated explanations, a common approach is to adopt an "MLLM as a judge" methodology, in which one MLLM evaluates explanations produced by other MLLMs. However, how well MLLMs perform when judging explanations for AI-generated image detection, whether produced by themselves or by other MLLMs, has not been well studied. We therefore propose XAIGID-RewardBench, the first benchmark designed to evaluate the ability of current MLLMs to judge the quality of explanations about whether an image is…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
