Towards the Development of a Real-Time Deepfake Audio Detection System in Communication Platforms

Jonat John Mathew; Rakin Ahsan; Sae Furukawa; Jagdish Gautham Krishna Kumar; Huzaifa Pallan; Agamjeet Singh Padda; Sara Adamski; Madhu Reddiboina; Arjun Pankajakshan

arXiv:2403.11778·cs.SD·March 19, 2024·3 cites

Open Access

TL;DR

This paper explores the feasibility of deploying static deepfake audio detection models in real-time communication platforms. The authors develop cross-platform software and evaluate models based on ResNet and LCNN architectures for timely detection.

Contribution

It introduces a practical framework for real-time deepfake audio detection using existing static models and evaluates them on the ASVspoof 2019 dataset, with the aim of improving audio stream security in communication platforms.

Findings

01 ResNet and LCNN models achieve benchmark performance relative to the ASVspoof 2019 challenge baselines

02 The software is cross-platform and capable of real-time detection

03 Strategies for enhancing the detection models are proposed

Abstract

Deepfake audio poses a rising threat in communication platforms, necessitating real-time detection for audio stream integrity. Unlike traditional non-real-time approaches, this study assesses the viability of employing static deepfake audio detection models in real-time communication platforms. Executable software is developed for cross-platform compatibility, enabling real-time execution. Two deepfake audio detection models based on ResNet and LCNN architectures are implemented using the ASVspoof 2019 dataset, achieving benchmark performances compared to ASVspoof 2019 challenge baselines. The study proposes strategies and frameworks for enhancing these models, paving the way for real-time deepfake audio detection in communication platforms. This work contributes to the advancement of audio stream security, ensuring robust detection capabilities in dynamic, real-time communication…
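
The idea of running a static (fixed-length, offline-trained) detector on a live stream can be illustrated with a sliding-window buffer. The sketch below is not the authors' software. It only shows one plausible way to cut incoming audio chunks into overlapping fixed-size windows and score each one. The names `stream_scores` and `mean_energy`, the window and hop sizes, and the 16 kHz default sample rate are all assumptions made for illustration, and `mean_energy` is a stand-in for a real ResNet or LCNN model.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def stream_scores(
    chunks: Iterable[Sequence[float]],
    score_fn: Callable[[List[float]], float],
    window: int,
    hop: int,
    sample_rate: int = 16_000,  # assumed default, not taken from the paper
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start_seconds, score) for every full window in the stream.

    `chunks` models audio arriving in arbitrary-sized pieces, as it would
    from a call or conferencing client. `score_fn` is any static model that
    expects exactly `window` samples.
    """
    if window <= 0 or hop <= 0 or hop > window:
        raise ValueError("require 0 < hop <= window")
    buffer: List[float] = []
    start = 0  # absolute sample index of buffer[0]
    for chunk in chunks:
        buffer.extend(chunk)
        # Score as many full windows as the buffer currently holds.
        while len(buffer) >= window:
            yield start / sample_rate, score_fn(buffer[:window])
            del buffer[:hop]  # slide forward, keeping the overlap
            start += hop


def mean_energy(frame: List[float]) -> float:
    """Hypothetical placeholder scorer (mean squared amplitude).

    This is NOT a deepfake detector; a trained model would go here.
    """
    return sum(x * x for x in frame) / len(frame)


if __name__ == "__main__":
    # Ten samples arriving in three uneven chunks, at a toy rate of 2 Hz.
    incoming = [[1.0] * 3, [1.0] * 3, [0.0] * 4]
    for t, s in stream_scores(incoming, mean_energy, window=4, hop=2, sample_rate=2):
        print(f"window starting at {t:.1f}s -> score {s:.2f}")
```

In such a design the hop size sets how often a decision is produced, while the window size is fixed by what the static model was trained on, so the first score cannot arrive before one full window of audio has been received.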

Taxonomy

Topics: Music and Audio Processing

Methods: Average Pooling · Max Pooling · Kaiming Initialization · Global Average Pooling · Convolution