Sequence Training and Adaptation of Highway Deep Neural Networks

Liang Lu

arXiv:1607.01963 · cs.CL · March 23, 2017 · 1 citation

TL;DR

This paper studies highway deep neural networks with tied gate functions for speech recognition. It shows that sequence training and speaker adaptation both improve accuracy, and that adaptation can be carried out by updating the gate functions alone.

Contribution

It introduces a sequence training and speaker adaptation approach for highway deep neural networks with tied gate functions, showing these gates effectively control information flow and enhance recognition performance.

Findings

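01. Sequence-discriminative training improves recognition accuracy over the model trained with the cross-entropy criterion.

02. Speaker adaptation improves accuracy when the gate functions are updated.

03. The tied gate functions effectively control information flow through the network.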

Abstract

The highway deep neural network (HDNN) is a type of depth-gated feedforward neural network that has been shown to be easier to train with more hidden layers, and to generalise better, than conventional plain deep neural networks (DNNs). Previously, we investigated a structured HDNN architecture for speech recognition in which the two gate functions were tied across all the hidden layers, and we were able to train a much smaller model without sacrificing recognition accuracy. In this paper, we continue the study of this architecture with a sequence-discriminative training criterion and speaker adaptation techniques on the AMI meeting speech recognition corpus. We show that these two techniques improve speech recognition accuracy on top of the model trained with the cross-entropy criterion. Furthermore, we demonstrate that the two gate functions that are tied across all the hidden…
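To make the tied-gate idea concrete, the following is a minimal pure-Python sketch of a forward pass through hidden layers that share a single transform gate and a single carry gate. The exact functional form is an assumption taken from the common highway-network formulation (sigmoid hidden units, sigmoid gates, elementwise combination); the text shown here does not spell it out. All names (`hdnn_forward`, `init_affine`, `count_params`) and the sizes are hypothetical, and the input and output layers are omitted.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, b, h):
    """Return W h + b, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, h)) + bi for row, bi in zip(W, b)]

def init_affine(n, rng):
    """Hypothetical helper: a small random n-by-n weight matrix and a zero bias."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n)] for _ in range(n)]
    return W, [0.0] * n

def hdnn_forward(h, layers, transform_gate, carry_gate):
    """Each hidden layer has its own weights; both gates are shared by all layers."""
    W_T, b_T = transform_gate
    W_C, b_C = carry_gate
    for W, b in layers:
        t = [sigmoid(a) for a in affine(W_T, b_T, h)]  # transform gate (assumed sigmoid)
        c = [sigmoid(a) for a in affine(W_C, b_C, h)]  # carry gate (assumed sigmoid)
        z = [sigmoid(a) for a in affine(W, b, h)]      # ordinary hidden activation
        # Gated mix of the new activation and the layer input.
        h = [zi * ti + hi * ci for zi, ti, hi, ci in zip(z, t, h, c)]
    return h

def count_params(affines):
    return sum(len(W) * len(W[0]) + len(b) for W, b in affines)

rng = random.Random(0)
n, depth = 8, 10  # illustrative sizes only
layers = [init_affine(n, rng) for _ in range(depth)]
transform_gate = init_affine(n, rng)
carry_gate = init_affine(n, rng)

out = hdnn_forward([rng.random() for _ in range(n)], layers, transform_gate, carry_gate)
gate_params = count_params([transform_gate, carry_gate])
layer_params = count_params(layers)
print(len(out), gate_params, layer_params)  # prints: 8 144 720
```

Because the gates are tied, their parameter count does not grow with depth: in this toy configuration the two gates hold 144 parameters against 720 in the hidden layers. That is one way to see why updating the gates alone is a compact form of adaptation, though the sketch says nothing about how much accuracy such an update recovers.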

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing