SynthForensics: Benchmarking and Evaluating People-Centric Synthetic Video Deepfakes

Roberto Leotta; Salvatore Alfio Sambataro; Claudio Vittorio Ragaglia; Mirko Casu; Yuri Petralia; Francesco Guarnera; Luca Guarnera; Sebastiano Battiato

arXiv:2602.04939·cs.CV·May 11, 2026

TL;DR

SynthForensics is a comprehensive benchmark dataset of synthetic videos that improves evaluation of deepfake detection by focusing on realistic human depiction and human-validated comparisons.

Contribution

It introduces a new benchmark dataset with paired-source videos, human validation, and multiple compression levels, addressing limitations of existing synthetic-video evaluation methods.

Findings

01 Raters prefer SynthForensics in 71-77% of head-to-head comparisons against each of nine existing synthetic-video benchmarks.

02 Face-based detection methods drop 13-55 AUC points (mean 27) from FF++ to SynthForensics.

03 Fine-tuning detectors narrows the performance gap, but at a cost to performance on legacy benchmarks.

Abstract

Modern T2V/I2V generators synthesize people increasingly hard to distinguish from authentic footage, while current evaluation suites lag: legacy benchmarks target manipulation-based forgeries, and recent synthetic-video benchmarks prioritize scale over realistic human depiction. We introduce SynthForensics, a people-centric benchmark of 20,445 videos from 8 T2V and 7 I2V open-source generators, paired-source from FF++/DFD reals, two-stage human-validated, in four compression versions with full metadata. In our paired-comparison human study, raters prefer SynthForensics in 71-77% of head-to-head comparisons against each of nine existing synthetic-video benchmarks, while facial-quality metrics fall within the FF++/DFD baseline range. Across 15 detectors and three protocols, face-based methods drop 13-55 AUC points (mean 27) from FF++ to SynthForensics and a further 23 …
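
To make the "AUC points" figures in the abstract concrete, here is a minimal sketch of how a cross-benchmark drop could be computed. It is not the paper's evaluation code: the function names `roc_auc` and `auc_point_drop` are hypothetical, the scores are made-up toy values, and it assumes AUC is reported on a 0-100 scale so that one "point" equals 0.01 of ROC AUC.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the pairwise (Mann-Whitney U) identity; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_point_drop(auc_source: float, auc_target: float) -> float:
    """Drop in AUC points (AUC scaled to 0-100) from source to target benchmark."""
    return 100.0 * (auc_source - auc_target)


# Toy, invented detector scores (NOT results from the paper).
# Label 1 = fake video, label 0 = real video.
legacy_labels = [1, 1, 1, 0, 0, 0]
legacy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
synthetic_labels = [1, 1, 1, 0, 0, 0]
synthetic_scores = [0.9, 0.4, 0.2, 0.5, 0.3, 0.1]

auc_legacy = roc_auc(legacy_labels, legacy_scores)
auc_synth = roc_auc(synthetic_labels, synthetic_scores)
drop = auc_point_drop(auc_legacy, auc_synth)

print(f"legacy AUC: {auc_legacy:.3f}")
print(f"synthetic AUC: {auc_synth:.3f}")
print(f"drop: {drop:.1f} AUC points")
```

The pairwise formulation is quadratic in the number of samples and is used here only for readability; a real evaluation would more likely rely on a library routine.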
