Gumbel Rao Monte Carlo based Bi-Modal Neural Architecture Search for Audio-Visual Deepfake Detection

Aravinda Reddy PN; Raghavendra Ramachandra; Krothapalli Sreenivasa Rao; Pabitra Mitra; Vinod Rathod

arXiv:2410.06543 · cs.CR · October 10, 2024

TL;DR

This paper introduces GRMC-BMNAS, a novel neural architecture search framework using Gumbel-Rao Monte Carlo sampling to optimize multimodal fusion for deepfake detection, achieving high accuracy with minimal parameters.

Contribution

It proposes a new architecture search method that refines Gumbel Softmax with Rao-Blackwellization, improving stability and performance in multimodal deepfake detection.
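
The Rao-Blackwellization idea can be illustrated with a minimal NumPy sketch. It follows the general Gumbel-Rao recipe (draw one discrete choice, then average several relaxed softmax samples conditioned on that choice) and is not taken from the paper's code. The function names and the parameters `k` (number of conditional samples) and `tau` (temperature) are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def conditional_perturbed_logits(logits, index, k, rng):
    """Draw k samples of (logits + Gumbel noise), conditioned on argmax == index."""
    n = logits.shape[0]
    E = rng.exponential(size=(k, n))           # Exp(1) noise, shape (k, n)
    Z = np.exp(logits).sum()                   # partition function
    top = E[:, index] / Z                      # exp(-max value), shape (k,)
    # Non-selected entries are Gumbels truncated below the maximum.
    perturbed = -np.log(E / np.exp(logits) + top[:, None])
    # The selected entry carries the maximum itself.
    perturbed[:, index] = -np.log(top)
    return perturbed

def gumbel_rao_sample(logits, k=10, tau=1.0, rng=None):
    """Return (one_hot, surrogate) for one categorical choice.

    one_hot   : the discrete sample used in the forward pass.
    surrogate : mean of k tempered softmax samples that all agree with the
                discrete sample; in an autodiff framework the gradient of
                this average would replace the single-sample
                straight-through gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    # Gumbel-max trick: one discrete sample from the categorical distribution.
    index = int(np.argmax(logits + rng.gumbel(size=logits.shape)))
    one_hot = np.eye(logits.shape[0])[index]
    perturbed = conditional_perturbed_logits(logits, index, k, rng)
    surrogate = softmax(perturbed / tau, axis=-1).mean(axis=0)
    return one_hot, surrogate

rng = np.random.default_rng(0)
one_hot, surrogate = gumbel_rao_sample([1.0, 0.5, -0.3, 2.0], k=16, tau=0.5, rng=rng)
```

With `k=1` this reduces to an ordinary straight-through Gumbel-Softmax draw; larger `k` averages out the noise that does not affect the discrete choice, which is where the variance reduction comes from.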

Findings

01. Achieved 95.4% AUC on FakeAVCeleb and SWAN-DF datasets.

02. Efficiently identifies crucial features from backbone networks.

03. Optimizes architecture for better generalization and classification performance.

Abstract

Deepfakes pose a critical threat to biometric authentication systems by generating highly realistic synthetic media. Existing multimodal deepfake detectors often struggle to adapt to diverse data and rely on simple fusion methods. To address these challenges, we propose Gumbel-Rao Monte Carlo Bi-modal Neural Architecture Search (GRMC-BMNAS), a novel architecture search framework that employs Gumbel-Rao Monte Carlo sampling to optimize multimodal fusion. It refines the Straight-Through Gumbel-Softmax (STGS) method by reducing variance with Rao-Blackwellization, which stabilizes network training. Using a two-level search approach, the framework optimizes the network architecture, parameters, and performance. Crucial features are efficiently identified from backbone networks, while within the cell structure, a weighted fusion operation integrates information from various sources. By varying…
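
As a rough illustration of what a weighted fusion of several feature sources could look like, the sketch below combines same-shaped audio and visual feature vectors using softmax-normalized weights. The abstract does not specify the operation, so the function `weighted_fusion`, the weight parameter `alpha`, and the toy feature values are all hypothetical.

```python
import numpy as np

def weighted_fusion(features, alpha):
    """Fuse same-shaped feature arrays with softmax-normalized weights.

    features : sequence of arrays, one per source (e.g. audio, visual).
    alpha    : one unnormalized weight per source (assumed learnable).
    """
    feats = np.stack([np.asarray(f, dtype=float) for f in features])
    alpha = np.asarray(alpha, dtype=float)
    w = np.exp(alpha - alpha.max())   # stable softmax over the sources
    w /= w.sum()
    # Weighted sum over the source axis.
    return np.tensordot(w, feats, axes=1)

audio_feat = np.ones(4)          # toy stand-in for an audio backbone feature
visual_feat = 3.0 * np.ones(4)   # toy stand-in for a visual backbone feature

fused_equal = weighted_fusion([audio_feat, visual_feat], alpha=[0.0, 0.0])
# equal weights: every entry is 2.0
fused_skewed = weighted_fusion([audio_feat, visual_feat], alpha=[0.0, np.log(3.0)])
# weights 0.25 and 0.75: every entry is 2.5
```

Normalizing the weights with a softmax keeps the fused feature on the same scale as its inputs; other normalizations would also be plausible.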

Taxonomy

Topics: Digital Media Forensic Detection · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis