Herd Mentality in Augmentation -- Not a Good Idea! A Robust Multi-stage Approach towards Deepfake Detection
Monu, Rohan Raju Dhanakshirur

TL;DR
This paper introduces a robust multi-stage deepfake detection approach using an enhanced GenConViT architecture with specialized training techniques, improving the F1 score by 1.71% and accuracy by 4.34% on the Celeb-DF v2 dataset.
Contribution
The paper presents a novel multi-stage deepfake detection model that explicitly focuses on artefacts, incorporating weighted loss, update augmentation, and masked eye pretraining for improved accuracy.
Findings
F1 score improved by 1.71% on Celeb-DF v2
Accuracy increased by 4.34% on Celeb-DF v2
Proposed method outperforms standard classifiers in deepfake detection
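The findings report both F1 and accuracy. As a reminder of how the two metrics relate, the sketch below computes both from confusion-matrix counts using their standard definitions. The function name and the counts in the usage example are hypothetical and are not taken from the paper; "fake" is assumed to be the positive class.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Return (accuracy, F1) for the positive class from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Treating "fake" as the positive class is an assumption.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical example: 90 fake and 10 real samples, and a degenerate
# classifier that labels every sample as fake.
acc, f1 = accuracy_and_f1(tp=90, fp=10, tn=0, fn=0)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

In this made-up case the degenerate classifier still scores high on both metrics because the positive class dominates, which is one reason the two numbers are usually read together with the class balance of the test set.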
Abstract
The rapid increase in deepfake technology has raised significant concerns about digital media integrity. Detecting deepfakes is crucial for safeguarding digital media. However, most standard image classifiers fail to distinguish between fake and real faces. Our analysis reveals that this failure is due to the models' inability to explicitly focus on the artefacts typically present in deepfakes. We propose an enhanced architecture based on the GenConViT model, which incorporates weighted loss and update augmentation techniques and includes masked eye pretraining. The proposed model improves the F1 score by 1.71% and the accuracy by 4.34% on the Celeb-DF v2 dataset. The source code for our model is available at https://github.com/Monu-Khicher-1/multi-stage-learning
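The abstract mentions a weighted loss but does not say how the weights are chosen. One common form is a class-weighted binary cross-entropy with inverse-frequency weights, sketched below in plain Python. This is an illustration under that assumption, not the authors' implementation; the function names, the weighting rule, and the sample counts are all hypothetical.

```python
import math


def inverse_frequency_weights(n_real, n_fake):
    """Return (w_real, w_fake) so that the rarer class gets the larger weight.

    Assumed rule: weight = total / (2 * class_count). A perfectly balanced
    dataset yields 1.0 for both classes.
    """
    total = n_real + n_fake
    return total / (2 * n_real), total / (2 * n_fake)


def weighted_bce(probs_fake, labels, w_real, w_fake, eps=1e-7):
    """Mean class-weighted binary cross-entropy.

    probs_fake: predicted probability that each sample is fake.
    labels: 1 for fake, 0 for real (an assumed convention).
    """
    if not labels:
        raise ValueError("labels must not be empty")
    total = 0.0
    for p, y in zip(probs_fake, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        if y == 1:
            total -= w_fake * math.log(p)
        else:
            total -= w_real * math.log(1 - p)
    return total / len(labels)


# Hypothetical class counts, chosen only to show the effect of the weights.
w_real, w_fake = inverse_frequency_weights(n_real=100, n_fake=900)
loss = weighted_bce([0.9, 0.8, 0.6], [1, 1, 0], w_real, w_fake)
print(f"w_real={w_real:.2f}, w_fake={w_fake:.2f}, loss={loss:.4f}")
```

With weights like these, a misclassified sample from the minority class contributes more to the loss than one from the majority class, which discourages the classifier from collapsing onto the dominant label.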
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Anomaly Detection Techniques and Applications
Methods: Focus
