Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning
Nilaksh Das, Duen Horng Chau

TL;DR
This paper explores how multi-task learning with diverse tasks enhances the adversarial robustness of automatic speech recognition systems, demonstrating significant improvements over single-task models.
Contribution
It is the first comprehensive study showing that multi-task learning improves adversarial robustness in speech recognition models, with detailed analysis and practical remedies.
Findings
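MTL with semantically diverse tasks, such as accent classification alongside ASR, makes it harder for an adversarial attack to succeed.
Absolute improvements of 17.25 to 59.90 in adversarially targeted word error rate (WER) over single-task models; a higher targeted WER means the attack is further from forcing its intended transcript.
Identifies pitfalls that undermine robustness in MTL and discusses remedies for them.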
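To make the targeted-WER finding concrete, here is a minimal sketch of how a word error rate can be computed with word-level edit distance. It assumes that "adversarially targeted WER" is measured between the transcript the attacker wants and the transcript the model actually produces under attack, so a higher value means a less successful attack. The function name `word_error_rate` and the example sentences are illustrative only and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical attack target (assumption, not taken from the paper).
attack_target = "open the front door"

# Attack fully succeeds: the model outputs exactly the target.
print(word_error_rate(attack_target, "open the front door"))

# Attack partly succeeds: one target word is missing.
print(word_error_rate(attack_target, "open the door"))

# Attack fails: the output shares no words with the target.
print(word_error_rate(attack_target, "play some music"))
```

Under this reading, a more robust model pushes the targeted WER up, because its output under attack stays far from the transcript the attacker is trying to force.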
Abstract
As automatic speech recognition (ASR) systems are now being widely deployed in the wild, the increasing threat of adversarial attacks raises serious questions about the security and reliability of using such systems. On the other hand, multi-task learning (MTL) has shown success in training models that can resist adversarial attacks in the computer vision domain. In this work, we investigate the impact of performing such multi-task learning on the adversarial robustness of ASR models in the speech domain. We conduct extensive MTL experimentation by combining semantically diverse tasks such as accent classification and ASR, and evaluate a wide range of adversarial settings. Our thorough analysis reveals that performing MTL with semantically diverse tasks consistently makes it harder for an adversarial attack to succeed. We also discuss in detail the serious pitfalls and their related…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
