Early Stage LM Integration Using Local and Global Log-Linear Combination

Wilfried Michel; Ralf Schlüter; Hermann Ney

arXiv:2005.10049 · eess.AS · May 21, 2020


TL;DR

This paper introduces a log-linear combination method with per-token renormalization for integrating an external language model into attention-based sequence-to-sequence speech recognition models, reducing word error rate relative to shallow fusion.

Contribution

It proposes a per-token renormalization approach for language model integration, enabling efficient full normalization and better performance than shallow fusion.
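To make the contrast with shallow fusion concrete, here is a minimal NumPy sketch of a single decoding step. It is an illustration under stated assumptions, not the authors' implementation: the function names, the scale parameters `am_scale` and `lm_scale`, and the toy probabilities are all hypothetical.

```python
import numpy as np

def log_softmax(scores):
    """Normalize log-scores over the last axis in a numerically stable way."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def shallow_fusion(log_p_am, log_p_lm, lm_scale):
    """Shallow fusion: weighted sum of log-probabilities, left unnormalized."""
    return log_p_am + lm_scale * log_p_lm

def local_log_linear(log_p_am, log_p_lm, am_scale, lm_scale):
    """Log-linear combination, renormalized over the vocabulary per token."""
    return log_softmax(am_scale * log_p_am + lm_scale * log_p_lm)

# Toy 4-token vocabulary for one decoding step (values are made up).
log_p_am = np.log(np.array([0.70, 0.20, 0.05, 0.05]))  # acoustic/attention model
log_p_lm = np.log(np.array([0.10, 0.60, 0.20, 0.10]))  # external language model

fused = shallow_fusion(log_p_am, log_p_lm, lm_scale=0.3)
combined = local_log_linear(log_p_am, log_p_lm, am_scale=1.0, lm_scale=0.3)

print("shallow fusion mass:", np.exp(fused).sum())      # does not sum to 1
print("renormalized mass:  ", np.exp(combined).sum())   # sums to 1
```

Because the renormalized scores form a proper distribution over the vocabulary at every step, they can in principle be plugged into a standard cross-entropy loss, which is one way to read the claim that the combination can be used during training as well as decoding.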

Findings

01. Significant WER reduction over shallow fusion.

02. Persistent improvements even with different LMs post-training.

03. Efficient computation of normalization terms in training and testing.

Abstract

Sequence-to-sequence models with an implicit alignment mechanism (e.g. attention) are closing the performance gap towards traditional hybrid hidden Markov models (HMM) for the task of automatic speech recognition. One important factor to improve word error rate in both cases is the use of an external language model (LM) trained on large text-only corpora. Language model integration is straightforward with the clear separation of acoustic model and language model in classical HMM-based modeling. In contrast, multiple integration schemes have been proposed for attention models. In this work, we present a novel method for language model integration into implicit-alignment based sequence-to-sequence models. Log-linear model combination of acoustic and language model is performed with a per-token renormalization. This allows us to compute the full normalization term efficiently both in…
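One plausible way to write the per-token renormalized log-linear combination described in the abstract is sketched below. The notation is an assumption made for illustration and may differ from the paper's: $x$ is the acoustic input, $w_1^{n-1}$ the token history, $\alpha$ and $\beta$ the acoustic and language model scales, and the sum in the denominator runs over the vocabulary.

```latex
p(w_n \mid w_1^{n-1}, x) =
  \frac{p_{\mathrm{AM}}^{\alpha}(w_n \mid w_1^{n-1}, x)\;
        p_{\mathrm{LM}}^{\beta}(w_n \mid w_1^{n-1})}
       {\sum_{w'} p_{\mathrm{AM}}^{\alpha}(w' \mid w_1^{n-1}, x)\;
        p_{\mathrm{LM}}^{\beta}(w' \mid w_1^{n-1})}
```

Under this reading, the normalization term requires only one sum over the vocabulary per output position rather than a sum over all token sequences, which is consistent with the abstract's statement that the full normalization can be computed efficiently.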
