Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition

Vikramjit Mitra; Horacio Franco

arXiv:1802.06861·cs.CL·February 21, 2018

TL;DR

This paper introduces a data-selection strategy for DNN acoustic model adaptation in speech recognition, using output layer activations to identify reliable hypotheses on unseen data, thereby improving adaptation quality.

Contribution

It proposes a novel method to assess hypothesis reliability based on output layer activations, enhancing unsupervised model adaptation for unseen data in speech recognition.

Findings

01. Improved model adaptation accuracy on unseen data.

02. Effective identification of reliable hypotheses using activation differences.

03. Enhanced speech recognition performance in challenging conditions.

Abstract

Unseen data can degrade the performance of deep neural net acoustic models. To cope with unseen data, adaptation techniques are deployed. For unlabeled unseen data, one must generate some hypothesis given an existing model, which is used as the label for model adaptation. However, assessing the goodness of the hypothesis can be difficult, and an erroneous hypothesis can lead to poorly trained models. In such cases, a strategy to select data with reliable hypotheses can ensure better model adaptation. This work proposes a data-selection strategy for DNN model adaptation, where DNN output layer activations are used to ascertain the goodness of a generated hypothesis. In a DNN acoustic model, the output layer activations are used to generate target class probabilities. Under unseen data conditions, the difference between the most probable target and the next most probable target is decreased…
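The idea of comparing the most probable target with the next most probable one can be illustrated with a minimal sketch. The abstract is truncated here and does not state the exact selection criterion, so everything below is an assumption for illustration only: softmax posteriors over the output activations, a per-frame top-1 minus top-2 margin, averaging that margin over an utterance, and a fixed threshold. The function names (`top2_margin`, `select_utterances`) and the default threshold are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Convert output-layer activations to class posteriors, row by row."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def top2_margin(logits):
    """Per-frame gap between the largest and second-largest posterior.

    logits: array of shape (num_frames, num_classes), num_classes >= 2.
    Returns an array of shape (num_frames,).
    """
    probs = softmax(logits)
    # np.partition moves the two largest values into the last two columns,
    # with the largest in the final column.
    top2 = np.partition(probs, -2, axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]


def select_utterances(utterance_logits, threshold=0.5):
    """Keep utterances whose mean top-2 margin reaches the threshold.

    utterance_logits: dict mapping utterance id -> (frames, classes) array.
    The mean-margin score and the threshold value are assumptions of this
    sketch, not the criterion used in the paper.
    """
    selected = []
    for utt_id, logits in utterance_logits.items():
        if top2_margin(logits).mean() >= threshold:
            selected.append(utt_id)
    return selected
```

In this sketch, an utterance whose frames each have one dominant class scores a margin near 1 and is kept for adaptation, while an utterance whose top two classes are nearly tied scores near 0 and is dropped. Any real threshold would need to be tuned on held-out data.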
