Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition
Vikramjit Mitra, Horacio Franco

TL;DR
This paper introduces a data-selection strategy for DNN acoustic model adaptation in speech recognition, using output layer activations to identify reliable hypotheses on unseen data, thereby improving adaptation quality.
Contribution
It proposes a novel method to assess hypothesis reliability based on output layer activations, enhancing unsupervised model adaptation for unseen data in speech recognition.
Findings
Improved model adaptation accuracy on unseen data.
Effective identification of reliable hypotheses using activation differences.
Enhanced speech recognition performance in challenging conditions.
Abstract
Unseen data can degrade the performance of deep neural net acoustic models. To cope with unseen data, adaptation techniques are deployed. For unlabeled unseen data, one must generate a hypothesis from an existing model, which is then used as the label for model adaptation. However, assessing the goodness of the hypothesis can be difficult, and an erroneous hypothesis can lead to poorly trained models. In such cases, a strategy that selects data with reliable hypotheses can ensure better model adaptation. This work proposes a data-selection strategy for DNN model adaptation, where DNN output layer activations are used to ascertain the goodness of a generated hypothesis. In a DNN acoustic model, the output layer activations are used to generate target class probabilities. Under unseen data conditions, the difference between the most probable target and the next most probable target is decreased…
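The idea in the abstract, that the gap between the most probable and the next most probable target shrinks on unseen data, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated, so the softmax over raw output activations, the per-utterance averaging of frame-level margins, the `threshold` value, and the function names `top2_margin` and `select_utterances` are all assumptions made here for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert output layer activations (frames x classes) to class probabilities."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def top2_margin(logits):
    """Per-frame gap between the most probable and the next most probable class."""
    probs = softmax(logits)
    # After partitioning, the last column holds the maximum and the
    # second-to-last column holds the second-largest probability.
    top2 = np.partition(probs, -2, axis=-1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def select_utterances(utterance_logits, threshold=0.5):
    """Keep utterances whose mean frame margin reaches the threshold.

    Both the averaging over frames and the threshold are hypothetical
    choices, not taken from the paper.
    """
    selected = []
    for utt_id, logits in utterance_logits.items():
        if top2_margin(logits).mean() >= threshold:
            selected.append(utt_id)
    return selected

# Toy example: one utterance with peaked activations, one with flat activations.
utterances = {
    "confident": np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]),
    "uncertain": np.array([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]),
}
selected = select_utterances(utterances, threshold=0.5)
```

In this toy setup, the utterance with peaked activations has a large margin and is kept for adaptation, while the one with flat activations has a margin of zero and is dropped. In practice the threshold would need tuning on held-out data.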
