Warped Language Models for Noise Robust Language Understanding
Mahdi Namazifar, Gokhan Tur, Dilek Hakkani-Tür

TL;DR
This paper introduces Warped Language Models (WLM), which incorporate insertion and dropping of tokens during training to improve robustness of language understanding systems against speech recognition noise.
Contribution
The paper proposes a novel training method for language models that enhances robustness to ASR errors by simulating noise through token insertion and dropping.
Findings
WLM outperforms traditional MLM in noisy speech scenarios.
WLM shows improved accuracy in spoken language understanding tasks.
Training with noise-like modifications enhances model robustness.
Abstract
Masked Language Models (MLM) are self-supervised neural networks trained to fill in the blanks in a given sentence with masked tokens. Despite the tremendous success of MLMs for various text-based tasks, they are not robust for spoken language understanding, especially in the presence of noise from recognizing spontaneous conversational speech. In this work we introduce Warped Language Models (WLM), in which input sentences at training time go through the same modifications as in MLM, plus two additional modifications, namely inserting and dropping random tokens. These two modifications extend and contract the sentence in addition to the modifications in MLMs, hence the word "warped" in the name. The insertion and drop modifications of the input text during training of WLM resemble the types of noise due to Automatic Speech Recognition (ASR) errors, and as a result WLMs are likely to be more robust to ASR…
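The warping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `warp`, the 15% selection rate, the uniform choice among the five operations, and the toy vocabulary are all assumptions made for illustration, and the abstract does not state the probabilities or the training targets the paper actually uses. Each selected token is either masked, replaced, or kept (the MLM-style modifications), or a random token is inserted before it, or it is dropped.

```python
import random

MASK = "[MASK]"
# The three MLM-style operations plus the two WLM additions (insert, drop).
OPS = ("mask", "replace", "keep", "insert", "drop")


def warp(tokens, vocab, rate=0.15, seed=None):
    """Return (warped_tokens, applied_ops) for one tokenized sentence.

    `rate` and the uniform choice among OPS are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    warped, ops = [], []
    for tok in tokens:
        # Most tokens pass through untouched.
        if rng.random() >= rate:
            warped.append(tok)
            continue
        op = rng.choice(OPS)
        ops.append(op)
        if op == "mask":
            warped.append(MASK)
        elif op == "replace":
            warped.append(rng.choice(vocab))
        elif op == "keep":
            warped.append(tok)
        elif op == "insert":
            # Extend the sentence: a random token, then the original one.
            warped.extend([rng.choice(vocab), tok])
        # "drop": contract the sentence by appending nothing.
    return warped, ops


if __name__ == "__main__":
    sentence = "book a table for two at seven".split()
    toy_vocab = ["the", "uh", "to", "table", "two"]
    print(warp(sentence, toy_vocab, rate=0.5, seed=0))
```

Because insertions lengthen the sequence and drops shorten it, the warped sentence no longer aligns position by position with the original, which is the property that distinguishes this scheme from plain masking.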
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
