Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

Apoorv Vyas; Srikanth Madikeri; Hervé Bourlard

arXiv:2012.14252 · cs.LG · April 7, 2021

TL;DR

This paper introduces a lattice-free MMI (LFMMI) adaptation method for self-supervised pretrained acoustic models and reports relative WER improvements of 4.3% to 35.3% over supervised-only baselines on Librispeech, Switchboard, Swahili, and Tagalog.

Contribution

It uses LFMMI as the supervised adaptation objective for a Transformer acoustic model pretrained with self-supervision on untranscribed Librispeech audio, improving recognition accuracy over a baseline trained only on the supervised data.

Findings

01

10% (clean) and 35.3% (other) relative WER reduction on Librispeech (100h)

02

10.8% relative WER reduction on Switchboard (300h)

03

4.3% and 4.4% relative WER reduction on Swahili (38h) and Tagalog (84h)

Abstract

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models. We pretrain a Transformer model on a thousand hours of untranscribed Librispeech data, followed by supervised adaptation with LFMMI on three different datasets. Our results show that by fine-tuning with LFMMI, we consistently obtain relative WER improvements of 10% and 35.3% on the clean and other test sets of Librispeech (100h), 10.8% on Switchboard (300h), 4.3% on Swahili (38h), and 4.4% on Tagalog (84h), compared to the baseline trained only with supervised data.
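
The improvements above are relative, not absolute, WER reductions: the drop in WER divided by the baseline WER. A minimal sketch of that arithmetic follows. The function name `relative_wer_reduction` and the example WER values are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction of the adapted model, in percent.

    Both arguments are word error rates on the same test set and in the
    same unit (for example, percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical WERs (not from the paper), used only to show the arithmetic:
# an absolute drop of 2.0 points from a 20.0% baseline.
baseline_wer = 20.0
adapted_wer = 18.0
reduction = relative_wer_reduction(baseline_wer, adapted_wer)
print(f"{reduction:.1f}% relative WER reduction")
```

Under this definition, the same absolute drop yields a larger relative reduction when the baseline WER is lower, so relative figures from different test sets are not directly comparable as absolute gains.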

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Attention Is All You Need · Dropout · Adam · Multi-Head Attention · Residual Connection · Byte Pair Encoding