# Lattice-based lightly-supervised acoustic model training

**Authors:** Joachim Fainberg, Ondřej Klejch, Steve Renals, Peter Bell

arXiv: 1905.13150 · 2019-07-16

## TL;DR

This paper introduces a lattice-based lightly supervised training method for acoustic models that combines inaccurate transcriptions with the lattices generated for semi-supervised training, reducing word error rate on a broadcast speech recognition task.

## Contribution

It proposes a technique that combines inaccurate transcriptions with the lattices used for semi-supervised LF-MMI training, preserving uncertainty in the lattice where appropriate.

## Key findings

- Reduces the expected error rates over the supervision lattices.
- Reduces word error rate (WER) on a broadcast task.
- Preserves uncertainty in the lattice where appropriate when combining it with inaccurate transcriptions.

## Abstract

In the broadcast domain there is an abundance of related text data and partial transcriptions, such as closed captions and subtitles. This text data can be used for lightly supervised training, in which text matching the audio is selected using an existing speech recognition model. Current approaches to light supervision typically filter the data based on matching error rates between the transcriptions and biased decoding hypotheses. In contrast, semi-supervised training does not require matching text data, instead generating a hypothesis using a background language model. State-of-the-art semi-supervised training uses lattice-based supervision with the lattice-free MMI (LF-MMI) objective function. We propose a technique to combine inaccurate transcriptions with the lattices generated for semi-supervised training, thus preserving uncertainty in the lattice where appropriate. We demonstrate that this combined approach reduces the expected error rates over the lattices, and reduces the word error rate (WER) on a broadcast task.
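The filtering step that the abstract attributes to current lightly supervised approaches can be illustrated with a minimal sketch. This is not the paper's proposed method; it shows only the baseline idea of keeping segments whose transcription closely matches a biased decoding hypothesis. The function names, the `(segment_id, transcript, hypothesis)` tuple layout, and the `max_error_rate` threshold of 0.2 are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def matching_error_rate(transcript: str, hypothesis: str) -> float:
    """Word-level error rate of the hypothesis, treating the transcript as reference."""
    ref = transcript.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    return word_edit_distance(ref, hyp) / len(ref)


def select_segments(
    segments: List[Tuple[str, str, str]],
    max_error_rate: float = 0.2,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[str, str, str]]:
    """Keep (segment_id, transcript, hypothesis) triples that match closely enough."""
    return [
        seg for seg in segments
        if matching_error_rate(seg[1], seg[2]) <= max_error_rate
    ]
```

For example, `select_segments([("s1", "the cat sat on the mat", "the cat sat on a mat")])` keeps the segment, because one substitution in six words is below the assumed threshold. A hard threshold like this discards whole segments, whereas the approach described in the abstract combines the inaccurate transcription with the lattice instead.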

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.13150/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1905.13150/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1905.13150/full.md

---
Source: https://tomesphere.com/paper/1905.13150