Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Kuan Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee

TL;DR
This paper proposes a domain adversarial training approach to improve the robustness of self-supervised speech processing models against various speech distortions, maintaining performance on clean speech and generalizing to unseen distortions.
Contribution
It introduces a domain adversarial training method for enhancing speech model robustness to distortions without sacrificing clean speech performance.
Findings
Improved robustness on distorted speech data.
Effective generalization to unseen distortions.
Maintained performance on clean speech.
Abstract
Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervision. It is high time to make these models more robust, so that they perform well on distorted speech without losing their original performance on clean speech. In this work, we propose to improve the robustness of speech processing models by domain adversarial training (DAT). We conducted experiments based on the SUPERB framework on five different speech processing tasks. Because the distortion types of speech data are not always known, we analyzed both the binary-domain and multi-domain settings: the former treats all distorted speech as one domain, and the latter views different distortions as different domains. In contrast to supervised training methods, we obtained promising results in target domains…
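The difference between the binary-domain and multi-domain settings comes down to how domain labels are assigned to each utterance before they are passed to the domain classifier. The following is a minimal sketch of that labeling step only, not the authors' implementation. The distortion names, the function names, and the choice of 0 for clean speech are all assumptions made for illustration.

```python
from typing import List, Optional

# Hypothetical distortion types, chosen only for illustration;
# the paper's actual distortion set is not listed in this summary.
DISTORTIONS: List[str] = ["noise", "reverb", "pitch_shift"]


def binary_domain_label(distortion: Optional[str]) -> int:
    """Binary-domain setting: clean speech is 0, any distorted speech is 1."""
    return 0 if distortion is None else 1


def multi_domain_label(distortion: Optional[str]) -> int:
    """Multi-domain setting: clean is 0, each distortion type gets its own id."""
    if distortion is None:
        return 0
    return 1 + DISTORTIONS.index(distortion)


# One entry per utterance; None marks clean speech.
batch = [None, "noise", "reverb", "pitch_shift", "noise"]
binary_labels = [binary_domain_label(d) for d in batch]
multi_labels = [multi_domain_label(d) for d in batch]

print(binary_labels)  # [0, 1, 1, 1, 1]
print(multi_labels)   # [0, 1, 2, 3, 1]
```

The binary labeling needs no knowledge of which distortion was applied, which matches the case where distortion types are unknown. The multi-domain labeling requires that information for every distorted utterance.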
Taxonomy
Topics: Speech Recognition and Synthesis · COVID-19 diagnosis using AI · Geophysical Methods and Applications
