Deepfake Detection that Generalizes Across Benchmarks
Andrii Yermakov, Jan Cech, Jiri Matas, Mario Fritz

TL;DR
This paper introduces GenD, a parameter-efficient method that fine-tunes only the Layer Normalization parameters of a pre-trained vision encoder. It improves deepfake detection generalization across diverse datasets and outperforms more complex recent approaches in average cross-dataset AUROC.
Contribution
It demonstrates that minimal adaptation of foundation models, specifically tuning only Layer Normalization parameters, achieves state-of-the-art generalization in deepfake detection.
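As a rough illustration of what "tuning only Layer Normalization parameters" involves, the sketch below implements layer normalization in plain Python and counts which parameters would be trainable in a toy model. Layer normalization has just two learnable vectors per layer, a scale (`gamma`) and a shift (`beta`). The parameter names, the `ln` naming convention, and the sizes in `toy_param_sizes` are hypothetical and are not taken from the paper or from any real encoder, so the resulting fraction is a toy value only.

```python
import math


def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize one feature vector, then apply the learnable scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]


def is_layer_norm_param(name):
    """Hypothetical naming convention: LayerNorm modules are called 'ln<k>'."""
    return any(part.startswith("ln") for part in name.split("."))


def trainable_fraction(param_sizes, is_trainable):
    """Fraction of all scalar parameters that the selector marks as trainable."""
    total = sum(param_sizes.values())
    trainable = sum(n for name, n in param_sizes.items() if is_trainable(name))
    return trainable / total


# Toy parameter table (assumed for illustration, not a real architecture).
toy_param_sizes = {
    "block0.attn.weight": 1000,
    "block0.ln1.gamma": 10,
    "block0.ln1.beta": 10,
    "block0.mlp.weight": 2000,
}

if __name__ == "__main__":
    frac = trainable_fraction(toy_param_sizes, is_layer_norm_param)
    print(f"trainable fraction in the toy model: {frac:.4%}")
```

In a deep learning framework the same idea would amount to freezing every parameter except the normalization scales and shifts before building the optimizer.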
Findings
Training on paired real-fake data from the same source improves generalization.
Detection difficulty on academic datasets has not increased over time.
Minimal parameter tuning can outperform complex models in cross-dataset performance.
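For readers unfamiliar with how cross-dataset performance is typically scored, the following sketch computes AUROC from its pairwise-ranking definition and averages it over several held-out datasets. It is an illustration only: the function names, the dataset names, and the scores in the usage example are made up, and the paper's actual evaluation protocol may differ in details.

```python
def auroc(labels, scores):
    """AUROC as the probability that a fake (label 1) scores above a real (label 0).

    Ties count as half a win. Quadratic in the number of samples, which is
    fine for a sketch but not for large test sets.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_cross_dataset_auroc(results):
    """Per-dataset AUROC plus the unweighted mean across datasets.

    `results` maps a dataset name to a (labels, scores) pair.
    """
    per_dataset = {name: auroc(l, s) for name, (l, s) in results.items()}
    return per_dataset, sum(per_dataset.values()) / len(per_dataset)


if __name__ == "__main__":
    # Hypothetical detector outputs on two unseen datasets.
    results = {
        "dataset_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        "dataset_b": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
    }
    per_dataset, mean_auc = mean_cross_dataset_auroc(results)
    print(per_dataset, mean_auc)
```

An unweighted mean treats every dataset equally regardless of size, which is one common convention; a sample-weighted mean would be another reasonable choice.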
Abstract
The generalization of deepfake detectors to unseen manipulation techniques remains a challenge for practical deployment. Although many approaches adapt foundation models by introducing significant architectural complexity, this work demonstrates that robust generalization is achievable through a parameter-efficient adaptation of one of the foundational pre-trained vision encoders. The proposed method, GenD, fine-tunes only the Layer Normalization parameters (0.03% of the total) and enhances generalization by enforcing a hyperspherical feature manifold using L2 normalization and metric learning on it. We conducted an extensive evaluation on 14 benchmark datasets spanning from 2019 to 2025. The proposed method achieves state-of-the-art performance, outperforming more complex, recent approaches in average cross-dataset AUROC. Our analysis yields two primary findings for the field: 1)…
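The hyperspherical feature manifold mentioned in the abstract can be pictured with a small sketch: L2 normalization rescales every feature vector to unit length, so all features lie on the unit sphere and comparisons between them reduce to cosine similarity. The code below shows only that normalization step and the similarity it induces. The abstract does not name the metric-learning loss, so none is implemented here, and the function names and example vectors are illustrative assumptions.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Project a feature vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    scale = max(norm, eps)  # guard against division by zero for an all-zero vector
    return [x / scale for x in v]


def cosine_similarity(a, b):
    """Dot product of the two L2-normalized vectors."""
    a_unit = l2_normalize(a)
    b_unit = l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


if __name__ == "__main__":
    # Made-up feature vectors, not real encoder outputs.
    feat_1 = [3.0, 4.0]
    feat_2 = [6.0, 8.0]    # same direction, different magnitude
    feat_3 = [-4.0, 3.0]   # orthogonal to feat_1
    print(l2_normalize(feat_1))
    print(cosine_similarity(feat_1, feat_2))
    print(cosine_similarity(feat_1, feat_3))
```

After normalization, feature magnitude no longer carries information, so a metric-learning objective operating on these vectors can only separate real and fake samples by their direction on the sphere.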
