Crafting Adversarial Examples For Speech Paralinguistics Applications
Yuan Gong, Christian Poellabauer

TL;DR
This paper introduces a method to generate adversarial audio examples that significantly degrade the performance of speech paralinguistic neural networks with minimal audio quality loss.
Contribution
It presents an end-to-end approach to create adversarial examples by perturbing raw audio waveforms, highlighting vulnerabilities in speech analysis systems.
Findings
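Adversarial perturbations applied to the raw waveform cause a significant performance drop in state-of-the-art deep neural networks for paralinguistic tasks.
The attacks remain effective while only minimally impairing audio quality.
The results point to security risks in speech-based applications such as speaker verification, deceptive speech detection, and medical diagnostics.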
Abstract
Computational paralinguistic analysis is increasingly being used in a wide range of cyber applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. In this work, we propose an end-to-end scheme to generate adversarial examples for computational paralinguistic applications by directly perturbing the raw waveform of an audio recording rather than specific acoustic features. Our experiments show that the proposed adversarial perturbation can lead to a significant performance drop of state-of-the-art deep neural networks, while only minimally impairing the audio quality.
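To make the idea of perturbing the raw waveform concrete, here is a minimal sketch under stated assumptions. The abstract does not say which attack algorithm or model is used, so this example assumes a gradient-sign style perturbation (one common way to craft adversarial examples) and replaces the deep network with a toy linear classifier whose input gradient can be written by hand. The names `loss_and_grad`, `gradient_sign_perturb`, `epsilon`, and the synthetic 220 Hz tone are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_grad(waveform, label, weights, bias):
    """Binary cross-entropy of a toy linear classifier, plus the gradient
    of that loss with respect to the raw waveform samples."""
    p = sigmoid(waveform @ weights + bias)
    loss = -(label * np.log(p) + (1.0 - label) * np.log(1.0 - p))
    grad = (p - label) * weights  # d(loss) / d(waveform)
    return loss, grad


def gradient_sign_perturb(waveform, label, weights, bias, epsilon=0.002):
    """Assumed attack: move every sample by +/- epsilon in the direction
    that increases the loss, then keep samples in the valid [-1, 1] range."""
    _, grad = loss_and_grad(waveform, label, weights, bias)
    adversarial = waveform + epsilon * np.sign(grad)
    return np.clip(adversarial, -1.0, 1.0)


# Hypothetical input: one second of a 220 Hz tone sampled at 16 kHz.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
waveform = 0.5 * np.sin(2.0 * np.pi * 220.0 * t)

# Hypothetical stand-in model: random linear weights over raw samples.
weights = rng.normal(scale=0.01, size=sample_rate)
bias = 0.0
label = 1.0
epsilon = 0.002

clean_loss, _ = loss_and_grad(waveform, label, weights, bias)
adversarial = gradient_sign_perturb(waveform, label, weights, bias, epsilon)
adv_loss, _ = loss_and_grad(adversarial, label, weights, bias)

# Signal-to-noise ratio as a rough proxy for how audible the change is.
noise = adversarial - waveform
snr_db = 10.0 * np.log10(np.sum(waveform ** 2) / np.sum(noise ** 2))

print(f"clean loss: {clean_loss:.4f}")
print(f"adversarial loss: {adv_loss:.4f}")
print(f"SNR of perturbation: {snr_db:.1f} dB")
```

The perturbation is bounded by `epsilon` per sample, which is what keeps the change small relative to the signal, while the sign of the input gradient pushes the classifier's loss upward. With a real deep network, the gradient would come from automatic differentiation rather than the closed form used here.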
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection · Speech Recognition and Synthesis
