TL;DR
SynthForensics is a people-centric benchmark of synthetic videos from 15 open-source generators (8 T2V, 7 I2V). It is built to test deepfake detectors on realistic human depiction, and its quality is checked through human validation and paired-comparison rater studies.
Contribution
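It introduces a benchmark whose synthetic videos are paired with source videos from FF++/DFD reals, validated by humans in two stages, and released in four compression versions with full metadata. This addresses two gaps in existing evaluation suites: legacy benchmarks target manipulation-based forgeries, and recent synthetic-video benchmarks prioritize scale over realistic human depiction.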
Findings
Raters prefer SynthForensics in 71-77% of head-to-head comparisons against each of nine existing synthetic-video benchmarks.
Face-based detection methods lose AUC when moved from FF++ to SynthForensics.
Fine-tuning detectors narrows the performance gap, but at a cost to performance on legacy benchmarks.
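The two quantities behind these findings, a paired-comparison preference rate and a detector's AUC drop between benchmarks, can be illustrated with a minimal sketch. The function names and the toy inputs below are hypothetical and are not taken from the paper; the paper's own evaluation code and protocols may differ.

```python
from typing import Sequence


def preference_rate(votes: Sequence[str], target: str) -> float:
    """Fraction of paired comparisons in which raters chose `target`."""
    if not votes:
        raise ValueError("at least one vote is required")
    return sum(v == target for v in votes) / len(votes)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a fake (label 1) scores higher
    than a real (label 0); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_drop_points(auc_source: float, auc_target: float) -> float:
    """Drop in AUC, in percentage points, from a source to a target benchmark."""
    return 100.0 * (auc_source - auc_target)


# Toy data, invented for illustration only (not results from the paper).
toy_votes = ["ours", "ours", "other", "ours"]
legacy_auc = auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
synthetic_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
print(preference_rate(toy_votes, "ours"))
print(auc_drop_points(legacy_auc, synthetic_auc))
```

The pairwise AUC definition is quadratic in the number of samples, which is fine for a sketch; a rank-based implementation would be the usual choice at benchmark scale.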
Abstract
Modern T2V/I2V generators synthesize people who are increasingly hard to distinguish from authentic footage, while current evaluation suites lag: legacy benchmarks target manipulation-based forgeries, and recent synthetic-video benchmarks prioritize scale over realistic human depiction. We introduce SynthForensics, a people-centric benchmark of videos from 8 T2V and 7 I2V open-source generators, paired-source from FF++/DFD reals, two-stage human-validated, in four compression versions with full metadata. In our paired-comparison human study, raters prefer SynthForensics in 71-77% of head-to-head comparisons against each of nine existing synthetic-video benchmarks, while facial-quality metrics fall within the FF++/DFD baseline range. Across 15 detectors and three protocols, face-based methods lose AUC points from FF++ to SynthForensics and a further …
