AVFakeBench: A Comprehensive Audio-Video Forgery Detection Benchmark for AV-LMMs

Shuhan Xia; Peipei Li; Xuannan Liu; Dongsen Zhang; Xinyu Guo; Zekun Li

arXiv:2511.21251·cs.CV·March 16, 2026

Open Access

TL;DR

AVFakeBench is a benchmark for audio-video forgery detection that covers seven forgery types and four levels of annotation across both human and general subjects, and is used to evaluate the forgery detection capabilities of AV-LMMs.

Contribution

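This paper introduces AVFakeBench, which the authors describe as the first comprehensive benchmark for AV forgery detection with rich forgery semantics and multi-task evaluation. It addresses two limitations of earlier benchmarks: their restriction to DeepFake-based forgeries and their single-granularity annotations.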

Findings

01. AV-LMMs show potential as forgery detectors.

02. Existing methods struggle with fine-grained perception.

03. Benchmark reveals weaknesses in current detection approaches.

Abstract

The threat of Audio-Video (AV) forgery is rapidly evolving beyond human-centric deepfakes to include more diverse manipulations across complex natural scenes. However, existing benchmarks are still confined to DeepFake-based forgeries and single-granularity annotations, and thus fail to capture the diversity and complexity of real-world forgery scenarios. To address this, we introduce AVFakeBench, the first comprehensive audio-video forgery detection benchmark that spans rich forgery semantics across both human and general subjects. AVFakeBench comprises 12K carefully curated audio-video questions, covering seven forgery types and four levels of annotations. To ensure high-quality and diverse forgeries, we propose a multi-stage hybrid forgery framework that integrates proprietary models for task planning with expert generative models for precise manipulation. The benchmark…

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning