Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study
Keyu An, Shiliang Zhang, Zhijie Yan

TL;DR
This paper empirically evaluates transformers taken from pre-trained language models (PLMs) as encoders in ASR systems. It reports lower character and word error rates when they are incorporated, particularly as an initialization for the ASR encoder.
Contribution
It provides an empirical analysis of transformers from pre-trained language models repurposed as ASR encoders, and reports that initializing the encoder from them lowers error rates.
Findings
Incorporating transformers from pre-trained LMs lowers Character Error Rate (CER) and Word Error Rate (WER) across diverse ASR tasks.
Pre-trained transformers serve as advantageous initializations for ASR encoders.
Semantic capabilities of transformers enhance ASR performance in complex scenarios.
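
For readers unfamiliar with the two metrics named in the findings, the following is a minimal sketch of how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It is not the paper's evaluation code. The function names are hypothetical, and the choices to split words on whitespace and to ignore spaces when counting characters are assumptions; real toolkits apply their own text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (token in ref missing from hyp)
                curr[j - 1] + 1,     # insertion (extra token in hyp)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("abcd", "abed"))
```

Lower values are better for both metrics. Because insertions count as errors, either rate can exceed 1.0 when the hypothesis is much longer than the reference.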
Abstract
In this study, we delve into the efficacy of transformers within pre-trained language models (PLMs) when repurposed as encoders for Automatic Speech Recognition (ASR). Our underlying hypothesis posits that, despite being initially trained on text-based corpora, these transformers possess a remarkable capacity to extract effective features from the input sequence. This inherent capability, we argue, is transferrable to speech data, thereby augmenting the acoustic modeling ability of ASR. Through rigorous empirical analysis, our findings reveal a notable improvement in Character Error Rate (CER) and Word Error Rate (WER) across diverse ASR tasks when transformers from pre-trained LMs are incorporated. Particularly, they serve as an advantageous starting point for initializing ASR encoders. Furthermore, we uncover that these transformers, when integrated into a well-established ASR…
Taxonomy
Topics: Fault Detection and Control Systems · Sensor Technology and Measurement Systems · Neural Networks and Applications
