Language Dependencies in Adversarial Attacks on Speech Recognition Systems
Karla Markert, Donika Mirdita, Konstantin Böttinger

TL;DR
This paper compares how vulnerable an English and a German speech recognition system are to adversarial attacks, and reports statistically significant differences between the two languages in the computational effort required for a successful attack.
Contribution
It provides the first inter-language comparison of ASR adversarial attackability, highlighting language-dependent robustness characteristics.
Findings
English and German ASR systems differ, with statistical significance, in the computational effort needed to generate adversarial examples.
German requires more computational effort for successful adversarial attacks.
Results suggest language influences robustness of speech recognition systems.
Abstract
Automatic speech recognition (ASR) systems are ubiquitous in everyday devices. They are vulnerable to adversarial attacks, in which manipulated input samples fool the ASR system's recognition. While adversarial examples for various English ASR systems have already been analyzed, there exists no inter-language comparative vulnerability analysis. We compare the attackability of a German and an English ASR system, taking DeepSpeech as an example. We investigate whether one of the language models is more susceptible to manipulations than the other. The results of our experiments suggest statistically significant differences between English and German in terms of the computational effort necessary for the successful generation of adversarial examples. This result encourages further research in language-dependent characteristics in the robustness analysis of ASR.
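To make the idea of a "statistically significant difference in computational effort" concrete, here is a minimal sketch of how two sets of attack-effort measurements could be compared. It assumes effort is measured as the number of optimization iterations until an attack succeeds, and it uses a two-sided permutation test on the difference in means. Both choices are assumptions made for illustration; the abstract does not say which effort metric or statistical test the authors used. The function name permutation_test and the iteration counts below are hypothetical and are not results from the paper.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled measurements to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Hypothetical iteration counts, invented for illustration only.
# They are NOT measurements from the paper.
iterations_lang_a = [120, 135, 150, 110, 140, 125, 160, 130]
iterations_lang_b = [210, 190, 230, 205, 220, 195, 240, 215]

diff, p_value = permutation_test(iterations_lang_a, iterations_lang_b)
print(f"difference in mean iterations: {diff:.1f}, p = {p_value:.4f}")
```

A permutation test is used here because it makes no assumption about the shape of the effort distribution. A small p-value would indicate that a gap this large is unlikely to arise if the language labels were interchangeable.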
