Gumbel Rao Monte Carlo based Bi-Modal Neural Architecture Search for Audio-Visual Deepfake Detection
Aravinda Reddy PN, Raghavendra Ramachandra, Krothapalli Sreenivasa Rao, Pabitra Mitra, Vinod Rathod

TL;DR
This paper introduces GRMC-BMNAS, a novel neural architecture search framework using Gumbel-Rao Monte Carlo sampling to optimize multimodal fusion for deepfake detection, achieving high accuracy with minimal parameters.
Contribution
It proposes a new architecture search method that refines Gumbel Softmax with Rao-Blackwellization, improving stability and performance in multimodal deepfake detection.
Findings
Achieved 95.4% AUC on FakeAVCeleb and SWAN-DF datasets.
Efficiently identifies crucial features from backbone networks.
Optimizes architecture for better generalization and classification performance.
Abstract
Deepfakes pose a critical threat to biometric authentication systems by generating highly realistic synthetic media. Existing multimodal deepfake detectors often struggle to adapt to diverse data and rely on simple fusion methods. To address these challenges, we propose Gumbel-Rao Monte Carlo Bi-modal Neural Architecture Search (GRMC-BMNAS), a novel architecture search framework that employs Gumbel-Rao Monte Carlo sampling to optimize multimodal fusion. It refines the Straight-Through Gumbel-Softmax (STGS) method by reducing variance with Rao-Blackwellization, stabilizing network training. Using a two-level search approach, the framework optimizes the network architecture, parameters, and performance. Crucial features are efficiently identified from backbone networks, while within the cell structure, a weighted fusion operation integrates information from various sources. By varying…
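The variance-reduction idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal, standard-library-only illustration of a Gumbel-Rao style estimator, assuming a single categorical choice over candidate operations. The function names, the example logits `alpha`, the temperature, and the sample count are all hypothetical. The sketch draws one discrete choice, then averages the softmax relaxation (and its Jacobian with respect to the logits, with the noise held fixed) over several Gumbel noise samples conditioned on that choice, instead of using a single noise sample as plain straight-through Gumbel-Softmax does.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(probs, rng):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def conditional_gumbel_logits(logits, chosen, rng):
    """Sample theta + G conditioned on argmax(theta + G) == chosen."""
    log_z = math.log(sum(math.exp(t) for t in logits))
    exp_draws = [rng.expovariate(1.0) for _ in logits]
    # The maximum is Gumbel-distributed with location log Z.
    top = -math.log(exp_draws[chosen]) + log_z
    perturbed = []
    for j, theta in enumerate(logits):
        if j == chosen:
            perturbed.append(top)
        else:
            # Gumbel with location theta, truncated to lie below `top`.
            inner = exp_draws[j] * math.exp(-theta) + math.exp(-top)
            perturbed.append(-math.log(inner))
    return perturbed


def gumbel_rao_estimate(logits, temperature=1.0, num_samples=32, seed=0):
    """Return (one_hot, mean_relaxed, mean_jacobian) for one discrete draw."""
    rng = random.Random(seed)
    n = len(logits)
    chosen = sample_categorical(softmax(logits), rng)
    one_hot = [1.0 if j == chosen else 0.0 for j in range(n)]
    mean_relaxed = [0.0] * n
    mean_jac = [[0.0] * n for _ in range(n)]
    for _ in range(num_samples):
        s = softmax(conditional_gumbel_logits(logits, chosen, rng), temperature)
        for a in range(n):
            mean_relaxed[a] += s[a] / num_samples
            for b in range(n):
                # d softmax_a / d theta_b, with the Gumbel noise held fixed.
                delta = 1.0 if a == b else 0.0
                mean_jac[a][b] += s[a] * (delta - s[b]) / (temperature * num_samples)
    return one_hot, mean_relaxed, mean_jac


# Hypothetical architecture logits for three candidate operations.
alpha = [0.5, 1.5, -0.3]
one_hot, relaxed, jac = gumbel_rao_estimate(alpha, temperature=0.5, num_samples=64)
```

In a straight-through setup, `one_hot` would be used in the forward pass and `jac` would stand in for the gradient of the discrete choice with respect to the logits. Setting `num_samples=1` gives a single-sample estimate comparable to plain STGS, and larger values trade extra computation for lower variance.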
Taxonomy
Topics: Digital Media Forensic Detection · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis
