Investigating the Impact of Speech Enhancement on Audio Deepfake Detection in Noisy Environments

Anacin; Angela; Shruti Kshirsagar; Anderson R. Avila

arXiv:2603.14767·cs.SD·March 17, 2026

Open Access

TL;DR

This study examines how speech enhancement algorithms affect the performance of audio deepfake detection in noisy environments, revealing that higher speech quality does not always translate to better spoofing detection accuracy.

Contribution

It provides an empirical analysis of the impact of different speech enhancement methods on audio deepfake detection performance under noisy conditions.

Findings

01 MetricGAN+ improves speech quality but worsens detection EER

02 SEGAN lowers speech quality scores but enhances detection performance

03 Enhancement artifacts can negatively influence spoofing detection accuracy
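
The findings are reported in terms of the equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. A minimal NumPy sketch of how such a value can be computed from detector scores is given here. The function name `compute_eer` and the convention that higher scores indicate bona fide speech are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold) for a detector's scores.

    Assumption: a higher score means "more likely bona fide", and a trial
    is accepted when its score is >= the threshold.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so that the "reject everything" operating point is included.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([np.mean(bonafide < t) for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([np.mean(spoof >= t) for t in thresholds])

    # With finitely many trials the two curves rarely cross exactly, so take
    # the threshold where they are closest and average the two rates.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = float((far[idx] + frr[idx]) / 2.0)
    return eer, float(thresholds[idx])
```

A lower EER means better spoofing detection, so "worsens detection EER" in the first finding corresponds to a higher value returned by a function like this one.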

Abstract

Logical Access (LA) attacks, also known as audio deepfake attacks, use Text-to-Speech (TTS) or Voice Conversion (VC) methods to generate spoofed speech data. Such attacks pose a serious threat to Automatic Speaker Verification (ASV) systems, as intruders can use them to bypass voice biometric security. In this study, we investigate the correlation between speech quality and the performance of audio spoofing detection systems (i.e., the LA task). To that end, two enhancement algorithms are evaluated on two perceptual speech quality measures, namely Perceptual Evaluation of Speech Quality (PESQ) and Speech-to-Reverberation Modulation Ratio (SRMR), and with respect to their impact on the audio spoofing detection system. We adopted the LA dataset, provided in the ASVspoof 2019 Challenge, and corrupted its test set with different Signal-to-Noise Ratio (SNR) levels,…
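
The abstract mentions corrupting the test set at different SNR levels. One common way to do this is to scale a noise recording so that the ratio of speech power to noise power matches a target value in decibels, then add it to the clean signal. The sketch below illustrates that idea only. The helper names `mix_at_snr` and `measured_snr_db` are hypothetical, and the noise types and exact SNR values used in the paper are not stated in this excerpt.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` so the mixture has the requested SNR in dB.

    Both inputs are 1-D arrays of samples at the same sampling rate.
    The noise is tiled or trimmed to the length of the clean signal.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Match the noise length to the clean signal.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if clean_power == 0.0 or noise_power == 0.0:
        raise ValueError("clean and noise signals must have non-zero power")

    # SNR_dB = 10 * log10(P_clean / P_noise), solved for the noise power.
    target_noise_power = clean_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return clean + scale * noise


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    speech_like = np.sin(2 * np.pi * 220.0 * t)  # stand-in for a clean utterance
    noise = rng.standard_normal(4000)            # stand-in for a noise recording
    for snr in (0, 10, 20):                      # illustrative levels only
        noisy = mix_at_snr(speech_like, noise, snr)
        print(snr, round(measured_snr_db(speech_like, noisy), 3))
```

In a pipeline like the one described, each corrupted utterance would then be passed through an enhancement model before being scored for quality and by the spoofing detector.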

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders