Towards Generalizable Deepfake Detection via Forgery-aware Audio-Visual Adaptation: A Variational Bayesian Approach
Fan Nie, Jiangqun Ni, Jian Zhang, Bin Zhang, Weizhe Zhang, Bin Li

TL;DR
This paper introduces FoVB, a novel variational Bayesian framework for multi-modal deepfake detection that models audio-visual correlations as latent variables, improving generalization and detection accuracy.
Contribution
It proposes a forgery-aware audio-visual adaptation method using variational Bayes to better capture cross-modal inconsistencies in deepfake detection.
Findings
Outperforms state-of-the-art methods on multiple benchmarks.
Effectively models audio-visual correlations as Gaussian latent variables.
Enhances detection accuracy and generalization across diverse deepfake datasets.
Abstract
The widespread use of AIGC content has brought not only unprecedented opportunities but also security concerns, e.g., audio-visual deepfakes. It is therefore important to develop an effective and generalizable method for multi-modal deepfake detection. Typically, audio-visual correlation learning can expose subtle cross-modal inconsistencies, e.g., audio-visual misalignment, which serve as crucial clues for deepfake detection. In this paper, we reformulate correlation learning as variational Bayesian estimation, where the audio-visual correlation is approximated as a Gaussian-distributed latent variable, and thus develop a novel framework for deepfake detection, i.e., Forgery-aware Audio-Visual Adaptation with Variational Bayes (FoVB). Specifically, given the prior knowledge of pre-trained backbones, we adopt two core designs to estimate…
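To make the idea of a Gaussian-distributed latent variable concrete, the sketch below shows the two generic building blocks that variational Bayesian models of this kind usually rely on: reparameterized sampling and a closed-form KL term. It is not the FoVB implementation. The standard-normal prior, the function names, the batch size, and the latent dimension are all assumptions made for illustration, and the posterior parameters are random placeholders rather than outputs of any audio-visual backbone.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over the latent axis.

    The N(0, I) prior is an assumption of this sketch, not a detail
    taken from the paper.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

rng = np.random.default_rng(0)

# Hypothetical posterior parameters: 4 clips, 8-dimensional latent.
mu = rng.normal(size=(4, 8))
log_var = rng.normal(scale=0.1, size=(4, 8))

z = sample_latent(mu, log_var, rng)       # one latent sample per clip
kl = kl_to_standard_normal(mu, log_var)   # one KL value per clip
```

In a full model, `mu` and `log_var` would be predicted from audio and visual features, and the KL term would be added to a classification loss to form a variational objective.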
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning
