AI-Powered Deepfake Detection Using CNN and Vision Transformer Architectures
Sifatullah Sheikh Urmi, Kirtonia Nuzath Tabassum Arthi, and Md Al-Imran

TL;DR
This paper evaluates four AI-based models (three CNNs and one Vision Transformer) for deepfake detection, showing that data augmentation and model choice significantly affect accuracy and efficiency.
Contribution
It introduces a comparative analysis of CNN and Vision Transformer architectures for deepfake detection, highlighting the effectiveness of data augmentation techniques.
Findings
VFDNET with MobileNetV3 achieved the highest accuracy
Data preprocessing improved model performance
Vision Transformer showed promising results in detection accuracy
Abstract
The increasing use of AI-generated deepfakes creates major challenges for maintaining digital authenticity. Four AI-based models, three CNNs and one Vision Transformer, were evaluated on large face image datasets. Data preprocessing and augmentation improved model performance across different scenarios. VFDNET with MobileNetV3 achieved superior accuracy while remaining efficient, demonstrating AI's capability for dependable deepfake detection.
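The abstract credits preprocessing and augmentation with improving performance but, as summarized here, does not name the specific transforms. The sketch below is a minimal illustration of what image augmentation typically looks like, assuming two common transforms (random horizontal flip and brightness scaling). The function names, parameters, and sample data are hypothetical and are not taken from the paper.

```python
import random

def horizontal_flip(image):
    """Mirror a grayscale image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def adjust_brightness(image, factor):
    """Scale pixel values by `factor`, clamping to the [0, 255] range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def augment(image, rng, flip_prob=0.5, brightness_range=(0.8, 1.2)):
    """Return a randomly augmented copy of `image`; the input is not modified.

    The flip probability and brightness range are illustrative assumptions.
    """
    out = [row[:] for row in image]
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    factor = rng.uniform(*brightness_range)
    return adjust_brightness(out, factor)

# Hypothetical 2x3 grayscale "face crop" used only for illustration.
sample = [[10, 20, 30], [40, 50, 60]]
rng = random.Random(0)  # seeded so repeated runs produce the same variants
augmented = [augment(sample, rng) for _ in range(4)]
```

Each call to `augment` yields a new variant of the same image, so a training set can be enlarged without collecting additional faces. A real pipeline would operate on full-colour tensors using an image library rather than nested lists.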
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
