A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection
Kyungbok Lee, You Zhang, Zhiyao Duan

TL;DR
This paper introduces a multi-stream fusion method with one-class learning for audio-visual deepfake detection, enhancing generalization to unseen fakes and providing interpretability of cues indicating fakery.
Contribution
It proposes a novel multi-stream fusion approach combined with one-class learning for improved generalization and interpretability in audio-visual deepfake detection.
Findings
Outperforms previous models significantly on the new benchmark
Demonstrates strong generalization to unseen fake types
Provides interpretability by identifying key modalities for fakery detection
Abstract
This paper addresses the challenge of developing a robust audio-visual deepfake detection model. In practice, new generation algorithms emerge continually and are not seen while detection methods are being developed, so a detector must generalize to unseen fakes. Additionally, to make detection credible, it is beneficial for the model to indicate which cues in the video mark it as fake. Motivated by these considerations, we propose a multi-stream fusion approach with one-class learning as a representation-level regularization technique. We study the generalization problem of audio-visual deepfake detection through a new benchmark created by extending and re-splitting the existing FakeAVCeleb dataset. The benchmark contains four categories of fake videos (Real Audio-Fake Visual, Fake Audio-Fake Visual, Fake…
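To make "one-class learning as a representation-level regularization" more concrete, here is a minimal sketch of one possible one-class objective, in the style of the OC-Softmax loss from the speech anti-spoofing literature. The abstract does not say which loss the authors use, so the formulation, the function names (`one_class_loss`, `cosine`, `softplus`), and the default margins and scale are all assumptions for illustration, not the paper's implementation. The idea is to pull embeddings of real samples toward a single learned direction and push embeddings of any fake away from it, so that unseen fake types also tend to fall outside the compact "real" region.

```python
import math


def _normalize(v):
    # Scale a vector to unit length (assumes v is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    # Cosine similarity between two vectors.
    u, v = _normalize(u), _normalize(v)
    return sum(a * b for a, b in zip(u, v))


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def one_class_loss(embeddings, labels, center,
                   m_real=0.9, m_fake=0.2, alpha=20.0):
    """Hypothetical OC-Softmax-style loss (an assumption, not the paper's code).

    embeddings: list of feature vectors from one stream
    labels:     list of bools, True if the sample is fake
    center:     direction that real embeddings should align with
    m_real:     real samples are penalized when similarity < m_real
    m_fake:     fake samples are penalized when similarity > m_fake
    alpha:      scale factor that sharpens the penalty
    """
    total = 0.0
    for emb, is_fake in zip(embeddings, labels):
        score = cosine(center, emb)
        if is_fake:
            total += softplus(alpha * (score - m_fake))
        else:
            total += softplus(alpha * (m_real - score))
    return total / len(embeddings)


# Toy usage: one real sample near the center, one fake sample far from it.
center = [1.0, 0.0]
embeddings = [[1.0, 0.1], [-1.0, 0.0]]
well_separated = one_class_loss(embeddings, [False, True], center)
mislabeled = one_class_loss(embeddings, [True, False], center)
```

In this toy case `well_separated` is much smaller than `mislabeled`, because the real embedding already lies close to the center and the fake one lies far from it. In a multi-stream model, a term like this could be added to each stream's training objective alongside the classification loss; how the streams are weighted and fused is not recoverable from the truncated abstract.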
Taxonomy
Topics: Digital Media Forensic Detection · Speech and Audio Processing · Anomaly Detection Techniques and Applications
