Investigating the Impact of Speech Enhancement on Audio Deepfake Detection in Noisy Environments
Anacin, Angela, Shruti Kshirsagar, Anderson R. Avila

TL;DR
This study examines how speech enhancement algorithms affect the performance of audio deepfake detection in noisy environments, revealing that higher speech quality does not always translate to better spoofing detection accuracy.
Contribution
It provides an empirical analysis of the impact of different speech enhancement methods on audio deepfake detection performance under noisy conditions.
Findings
MetricGAN+ improves speech quality but worsens detection EER
SEGAN lowers speech quality scores but enhances detection performance
Enhancement artifacts can negatively influence spoofing detection accuracy
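The findings above are stated in terms of the Equal Error Rate (EER), the operating point at which the rate of spoofed trials accepted equals the rate of bona fide trials rejected. The following is a minimal sketch of that computation, not the paper's scoring code. It assumes that a higher detector score means "more likely bona fide"; the function name `compute_eer` and the threshold sweep over observed scores are illustrative choices.

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher score = more likely bona fide.

    Illustrative helper: sweeps every observed score as a candidate threshold
    and picks the one where false acceptance and false rejection are closest.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    if bonafide.size == 0 or spoof.size == 0:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every distinct score, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scoring below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error curves cross; take the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])
```

A lower EER means better detection, so an enhancement method "worsening" EER means the detector's EER rises when it is fed the enhanced audio.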
Abstract
Logical Access (LA) attacks, also known as audio deepfake attacks, use Text-to-Speech (TTS) or Voice Conversion (VC) methods to generate spoofed speech data. This can represent a serious threat to Automatic Speaker Verification (ASV) systems, as intruders can use such attacks to bypass voice biometric security. In this study, we investigate the correlation between speech quality and the performance of audio spoofing detection systems (i.e., the LA task). To that end, two enhancement algorithms are evaluated both on two perceptual speech quality measures, namely Perceptual Evaluation of Speech Quality (PESQ) and Speech-to-Reverberation Modulation Ratio (SRMR), and with respect to their impact on the audio spoofing detection system. We adopted the LA dataset, provided in the ASVspoof 2019 Challenge, and corrupted its test set with different Signal-to-Noise Ratio (SNR) levels,…
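The abstract mentions corrupting the test set at different SNR levels. One common way to do this is to scale a noise recording so that the speech-to-noise power ratio matches a target value in dB, then add it to the clean signal. The sketch below illustrates that idea under stated assumptions: signals are 1-D NumPy arrays at the same sampling rate, and the names `mix_at_snr` and `measured_snr_db` are hypothetical. The paper's actual noise types, SNR values, and mixing procedure are not given in this summary.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    Illustrative helper, not the authors' pipeline. The noise is tiled or
    trimmed to the speech length before scaling.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if speech.size == 0 or noise.size == 0:
        raise ValueError("speech and noise must be non-empty")

    # Match lengths: repeat the noise if it is too short, then trim.
    if noise.size < speech.size:
        reps = int(np.ceil(speech.size / noise.size))
        noise = np.tile(noise, reps)
    noise = noise[: speech.size]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must have non-zero power")

    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise


def measured_snr_db(clean, noisy):
    """SNR of a mixture in dB, given the clean reference signal."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))
```

Calling `mix_at_snr(clean, noise, 10.0)` yields a mixture whose `measured_snr_db` against `clean` is 10 dB up to floating-point error; lower targets such as 0 dB give a harder condition for both the enhancer and the detector.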
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
