# Cumulative Adaptation for BLSTM Acoustic Models

**Authors:** Markus Kitza, Pavel Golik, Ralf Schlüter, Hermann Ney

arXiv: 1906.06207 · 2019-06-17

## TL;DR

This paper explores cumulative adaptation techniques for BLSTM acoustic models, combining i-vector based adaptation with within-network transformations to improve speech recognition robustness and reduce word error rates.

## Contribution

It investigates applying adaptation methods cumulatively to BLSTM acoustic models: a first pass of i-vector based adaptation, followed by a second pass that uses speaker- and environment-dependent transformations within the network.

## Key findings

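- 8% relative WER reduction on the NIST Hub5 2000 evaluation test set from i-vector adaptation
- A further 5% relative WER reduction from second-pass adaptation with speaker- and environment-dependent transformations within the network
- Re-evaluating the features used to estimate i-vectors, and their normalization, was part of reaching the best performance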
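
The two relative gains listed above compound rather than add. The following is a minimal sketch of that arithmetic, assuming the 5% figure is measured relative to the i-vector adapted system (as "further relative improvement" in the abstract suggests). The function names are illustrative, and no absolute WER values from the paper are used.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction of new_wer with respect to baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def compound_reduction(*stage_reductions: float) -> float:
    """Overall relative reduction when each stage is relative to the previous one."""
    remaining = 1.0  # fraction of the baseline WER that is still left
    for reduction in stage_reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining


# Assumption: 8% relative (first pass), then 5% relative to the first-pass result.
total = compound_reduction(0.08, 0.05)
print(f"combined relative WER reduction: {total:.1%}")
```

Under that assumption the combined gain over the unadapted baseline is about 12.6% relative, slightly less than the 13% a plain sum would suggest.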

## Abstract

This paper addresses the robust speech recognition problem as an adaptation task. Specifically, we investigate the cumulative application of adaptation methods. A bidirectional Long Short-Term Memory (BLSTM) based neural network, capable of learning temporal relationships and translation invariant representations, is used for robust acoustic modelling. Further, i-vectors were used as an input to the neural network to perform instantaneous speaker and environment adaptation, providing 8% relative improvement in word error rate on the NIST Hub5 2000 evaluation test set. By enhancing the first-pass i-vector based adaptation with a second-pass adaptation using speaker and environment dependent transformations within the network, a further relative improvement of 5% in word error rate was achieved. We have reevaluated the features used to estimate i-vectors and their normalization to achieve the best performance in a modern large scale automatic speech recognition system.
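
The abstract states that i-vectors are fed to the network as an additional input. One common way to do this, shown here only as a hedged sketch, is to append the same i-vector to every acoustic frame it applies to. This summary does not say how the authors combine the two, so the function `append_ivector`, the dimensions, and the values below are all hypothetical.

```python
from typing import List, Sequence


def append_ivector(
    frames: Sequence[Sequence[float]],
    ivector: Sequence[float],
) -> List[List[float]]:
    """Concatenate one fixed i-vector onto every frame of acoustic features."""
    return [list(frame) + list(ivector) for frame in frames]


# Hypothetical toy input: 3 frames of 4-dimensional features, 2-dimensional i-vector.
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
ivector = [-0.5, 0.25]

augmented = append_ivector(frames, ivector)
print(len(augmented), len(augmented[0]))
```

Each augmented frame keeps its original features and gains the i-vector dimensions, so the network sees the same speaker and environment summary at every time step.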

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.06207/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/1906.06207/full.md

---
Source: https://tomesphere.com/paper/1906.06207