TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection

Xinqi Xiong; Prakrut Patel; Qingyuan Fan; Amisha Wadhwa; Sarathy Selvam; Xiao Guo; Luchao Qi; Xiaoming Liu; Roni Sengupta

arXiv:2505.24866 · cs.CV · January 21, 2026

TL;DR

TalkingHeadBench is a comprehensive benchmark and dataset for evaluating the robustness and generalization of talking-head deepfake detectors against the latest generative models, addressing the limitations of existing benchmarks.

Contribution

The paper introduces a new multi-modal benchmark, with diverse datasets and evaluation protocols, for assessing how robustly detection models generalize to advanced deepfake generators.

Findings

01 Existing detectors show limited robustness to new generators.

02 Transformers outperform CNNs in detection accuracy.

03 Error analysis reveals common failure modes and biases.

Abstract

The rapid advancement of talking-head deepfake generation fueled by advanced generative models has elevated the realism of synthetic videos to a level that poses substantial risks in domains such as media, politics, and finance. However, current benchmarks for deepfake talking-head detection fail to reflect this progress, relying on outdated generators and offering limited insight into model robustness and generalization. We introduce TalkingHeadBench, a comprehensive multi-model multi-generator benchmark and curated dataset designed to evaluate the performance of state-of-the-art detectors on the most advanced generators. Our dataset includes deepfakes synthesized by leading academic and commercial models and features carefully constructed protocols to assess generalization under distribution shifts in identity and generator characteristics. We benchmark a diverse set of existing…
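The abstract mentions protocols that test generalization under shifts in identity and generator. As a rough illustration only, the sketch below shows one way such a split could be built: training and test sets share no identities, and the test fakes come from a generator never seen in training. The sample schema, the names `gen_x`, `gen_y`, `id_a`, `id_b`, and the function `split_by_shift` are all hypothetical and are not taken from the paper or its dataset.

```python
def split_by_shift(samples, held_out_generator, held_out_identities):
    """Build a train/test split with both identity and generator shift.

    Each sample is a dict with hypothetical keys:
      'identity'  - subject identifier
      'generator' - name of the generator, or None for a real video
      'label'     - 0 for real, 1 for fake
    Samples that would leak the held-out generator or identities
    into training (or seen generators into testing) are dropped.
    """
    train, test = [], []
    for s in samples:
        unseen_identity = s["identity"] in held_out_identities
        unseen_generator = s["generator"] == held_out_generator
        is_real = s["generator"] is None
        if not unseen_identity and not unseen_generator:
            # Seen identity, and either real or from a seen generator.
            train.append(s)
        elif unseen_identity and (is_real or unseen_generator):
            # Unseen identity, and either real or from the unseen generator.
            test.append(s)
    return train, test


# Toy data: two identities, one real and two fake videos each (all hypothetical).
samples = [
    {"identity": "id_a", "generator": None, "label": 0},
    {"identity": "id_a", "generator": "gen_x", "label": 1},
    {"identity": "id_a", "generator": "gen_y", "label": 1},
    {"identity": "id_b", "generator": None, "label": 0},
    {"identity": "id_b", "generator": "gen_x", "label": 1},
    {"identity": "id_b", "generator": "gen_y", "label": 1},
]

train, test = split_by_shift(samples, "gen_y", {"id_b"})
```

A detector trained on `train` and scored on `test` would then be measured on subjects and a generator it has never seen, which is the kind of distribution shift the abstract describes; the paper's actual protocols may be constructed differently.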

Code & Models

Datasets

luchaoqi/TalkingHeadBench
dataset · 3.4k downloads

Taxonomy

Methods: Sparse Evolutionary Training