Spoofing Detection Goes Noisy: An Analysis of Synthetic Speech Detection in the Presence of Additive Noise

Cemal Hanilci; Tomi Kinnunen; Md Sahidullah; Aleksandr Sizov

arXiv:1603.03947 · cs.SD · September 16, 2016 · 2 cites

TL;DR

This paper investigates the robustness of synthetic speech detection methods in noisy environments, revealing their vulnerability to noise and exploring feature combinations to improve detection accuracy.

Contribution

It provides a comprehensive analysis of state-of-the-art spoofing detectors under additive noise, highlighting their limitations and proposing fusion strategies for better performance.

Findings

01

All detectors break down under additive noise, even at relatively high SNRs.

02

Speech enhancement does not improve detection.

03

Fusion of features enhances robustness and accuracy.

Abstract

Automatic speaker verification (ASV) technology is recently finding its way to end-user applications for secure access to personal data, smart services or physical facilities. Similar to other biometric technologies, speaker verification is vulnerable to spoofing attacks where an attacker masquerades as a particular target speaker via impersonation, replay, text-to-speech (TTS) or voice conversion (VC) techniques to gain illegitimate access to the system. We focus on TTS and VC that represent the most flexible, high-end spoofing attacks. Most of the prior studies on synthesized or converted speech detection report their findings using high-quality clean recordings. Meanwhile, the performance of spoofing detectors in the presence of additive noise, an important consideration in practical ASV implementations, remains largely unknown. To this end, we analyze the suitability of…
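
To make "additive noise" at a given signal-to-noise ratio (SNR) concrete, here is a minimal Python sketch of mixing a noise signal into clean speech at a target SNR. It is an illustration under stated assumptions, not the authors' procedure: the function names `add_noise_at_snr` and `measured_snr_db` are hypothetical, power is averaged over the whole signal rather than over active speech only, and the sine tone and Gaussian noise are stand-ins for real recordings.

```python
import math
import random


def _mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def add_noise_at_snr(speech, noise, snr_db):
    """Return speech + gain * noise, with gain chosen to hit snr_db.

    Hypothetical helper: assumes equal-length signals and whole-signal power.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = _mean_power(speech)
    noise_power = _mean_power(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain**2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR in dB from the clean signal and the noisy mixture."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(_mean_power(speech) / _mean_power(residual))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    sample_rate = 16000
    # Stand-in "speech": one second of a 440 Hz tone.
    speech = [math.sin(2 * math.pi * 440 * t / sample_rate)
              for t in range(sample_rate)]
    # Stand-in noise: white Gaussian samples.
    noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
    for target in (20, 10, 0):
        noisy = add_noise_at_snr(speech, noise, target)
        print(target, round(measured_snr_db(speech, noisy), 3))
```

A lower SNR means a larger noise gain relative to the speech, so a detector evaluated at 0 dB sees noise with the same average power as the speech itself.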

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing