Survey on AI-Generated Media Detection: From Non-MLLM to MLLM

Yueying Zou; Peipei Li; Zekun Li; Huaibo Huang; Xing Cui; Xuannan Liu; Chenghanyu Zhang; Ran He

arXiv:2502.05240·cs.CV·February 13, 2025

Open Access

TL;DR

This survey reviews the evolution of AI-generated media detection methods, comparing non-MLLM and MLLM approaches, analyzing their methodologies, challenges, and ethical considerations to guide future research.

Contribution

It provides the first comprehensive comparison of Non-MLLM-based and MLLM-based detection methods, analyzing their differences and potential hybrids and addressing ethical and regulatory issues.

Findings

01 MLLM-based detectors offer broader applicability and explainability.

02 Hybrid approaches show promise in improving detection accuracy.

03 Regulatory landscapes vary significantly across jurisdictions.

Abstract

The proliferation of AI-generated media poses significant challenges to information authenticity and social trust, creating strong demand for reliable detection methods. Methods for detecting AI-generated media have evolved rapidly, paralleling the advancement of Multimodal Large Language Models (MLLMs). Current detection approaches fall into two main groups: Non-MLLM-based and MLLM-based methods. The former employ high-precision, domain-specific detectors powered by deep learning techniques, while the latter use general-purpose detectors based on MLLMs that integrate authenticity verification, explainability, and localization capabilities. Despite significant progress in this field, the literature still lacks a comprehensive survey that examines the transition from domain-specific to general-purpose detection methods. This paper addresses this gap by…

Taxonomy

Topics: Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis