Analyzing Robustness of End-to-End Neural Models for Automatic Speech Recognition

Goutham Rajendran; Wei Zou

arXiv:2208.08509 · cs.CL · August 19, 2022 · 1 citation

TL;DR

This paper analyzes the robustness of pre-trained neural speech recognition models like wav2vec2, HuBERT, and DistilHuBERT against various noise types, providing insights into their layer-wise behavior and error propagation.

Contribution

It offers a comprehensive robustness analysis of popular pre-trained speech models, including layer-wise and error propagation insights under noisy conditions.

Findings

01 Model performance degrades as noise increases.

02 Layer-wise analysis reveals how noise affects different layers.

03 Error propagation patterns differ between clean and noisy data.
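One way to picture the layer-wise finding is to compare a model's hidden states for a clean utterance against those for its noisy counterpart, layer by layer. The sketch below is a minimal illustration and not the authors' procedure: the function name `layerwise_cosine` is hypothetical, the choice of cosine similarity as the comparison metric is an assumption, and the hidden states are assumed to be already extracted as one `(frames, dim)` array per layer (for example, Hugging Face `transformers` models return per-layer states when called with `output_hidden_states=True`).

```python
import numpy as np

def layerwise_cosine(clean_layers, noisy_layers, eps=1e-12):
    """Mean per-frame cosine similarity between clean and noisy hidden states.

    Hypothetical helper: each argument is a sequence with one array of shape
    (frames, dim) per layer. Returns one float per layer.
    """
    scores = []
    for clean, noisy in zip(clean_layers, noisy_layers):
        clean = np.asarray(clean, dtype=np.float64)
        noisy = np.asarray(noisy, dtype=np.float64)
        # Cosine similarity of each frame's clean vector with its noisy vector.
        dot = np.sum(clean * noisy, axis=-1)
        norms = np.linalg.norm(clean, axis=-1) * np.linalg.norm(noisy, axis=-1)
        scores.append(float(np.mean(dot / (norms + eps))))
    return scores

if __name__ == "__main__":
    # Synthetic stand-in for real hidden states: 3 layers, 5 frames, 4 dims.
    rng = np.random.default_rng(0)
    clean = [rng.normal(size=(5, 4)) for _ in range(3)]
    # Perturb deeper layers more, purely to make the output vary by layer.
    noisy = [h + 0.1 * (i + 1) * rng.normal(size=h.shape)
             for i, h in enumerate(clean)]
    print(layerwise_cosine(clean, noisy))
```

A score near 1 for a layer means its representation barely moved under noise. Lower scores mark layers where the noisy input diverges more from the clean one.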

Abstract

We investigate robustness properties of pre-trained neural models for automatic speech recognition. Real-life data in machine learning is usually very noisy and almost never clean, which can be attributed to various factors depending on the domain, e.g. outliers, random noise and adversarial noise. Therefore, the models we develop for various tasks should be robust to such kinds of noisy data, which has led to the thriving field of robust machine learning. We consider this important issue in the setting of automatic speech recognition. With the increasing popularity of pre-trained models, it is important to analyze and understand the robustness of such models to noise. In this work, we perform a robustness analysis of the pre-trained neural models wav2vec2, HuBERT and DistilHuBERT on the LibriSpeech and TIMIT datasets. We use different kinds of noising mechanisms and measure the…
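The abstract is truncated before it names the specific noising mechanisms or metrics, so the following is only a generic sketch of how such a robustness measurement is commonly set up. It assumes additive white Gaussian noise at a chosen signal-to-noise ratio as the noising mechanism and word error rate (WER) as the metric. Both function names are hypothetical, and neither choice is confirmed by the text above.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR in dB.

    Assumed noising mechanism for illustration only.
    """
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / max(len(ref), 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    clean = np.sin(2 * np.pi * 440.0 * t)  # synthetic tone standing in for speech
    noisy = add_noise_at_snr(clean, snr_db=10.0, rng=rng)
    # In a real experiment, `noisy` would be transcribed by the model under
    # test, and the transcript scored against the reference text.
    print(word_error_rate("the cat sat", "the bat sat"))
```

Sweeping `snr_db` over a range and plotting WER against it gives a simple degradation curve per model.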

Code & Models

Repositories

weizou52/robustness_analysis_asr (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Neural Networks and Applications