Recurrent DNNs and its Ensembles on the TIMIT Phone Recognition Task

Jan Vanek; Josef Michalek; Josef Psutka

arXiv:1806.07186·cs.CL·June 20, 2018

TL;DR

This paper evaluates recurrent deep neural networks with various regularization techniques and ensembles on the TIMIT phone recognition task, achieving state-of-the-art results and providing open-source code for reproducibility.

Contribution

It demonstrates that an ensemble of recurrent DNNs trained with combined regularization techniques outperforms individual models on TIMIT, setting a new benchmark for phone error rate.

Findings

01 An ensemble of recurrent DNNs achieved a phone error rate (PER) of 14.84%.

02 Regularization with dropout, zoneout, and a regularization post-layer improves performance.

03 Open-source scripts enable easy replication and further research.
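For readers unfamiliar with the metric behind the first finding, PER is conventionally computed as the edit distance (substitutions, deletions, insertions) between the recognized and reference phone sequences, divided by the number of reference phones. The following is a minimal sketch of that standard definition; the function names and the toy phone sequences are illustrative assumptions and are not taken from the paper or its repository.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """PER in percent over a set of utterances (hypothetical helper)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


# Toy example: one substitution (ae -> ah) and one deletion (final sil).
reference = [["sil", "k", "ae", "t", "sil"]]
hypothesis = [["sil", "k", "ah", "t"]]
per = phone_error_rate(reference, hypothesis)  # 2 errors / 5 phones
```

On the toy pair above, two errors against five reference phones give a PER of 40%.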

Abstract

In this paper, we investigate recurrent deep neural networks (DNNs) in combination with regularization techniques such as dropout, zoneout, and a regularization post-layer. As a benchmark, we chose the TIMIT phone recognition task because of its popularity and broad availability in the community. It also simulates a low-resource scenario, which is relevant for minor languages. We also prefer the phone recognition task because it is much more sensitive to acoustic model quality than a large-vocabulary continuous speech recognition task. In recent years, recurrent DNNs have pushed error rates in automatic speech recognition down, but no clear winner has emerged among the proposed architectures. Dropout was used as the regularization technique in most cases, but its combination with other regularization techniques and with model ensembles was not explored. However, just an ensemble of recurrent DNNs…
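Two of the ideas named in the abstract can be illustrated in a few lines. Zoneout, as generally described in the literature, randomly keeps each recurrent hidden unit at its previous value during training and uses the expected value at inference. A simple way to build an ensemble is to average the per-frame class posteriors of several models. The sketch below is written under those assumptions only; the function names, the plain-list representation, and the averaging scheme are hypothetical and may differ from what the authors actually implemented.

```python
import random


def zoneout(h_prev, h_new, rate, training, rng=random):
    """Apply zoneout to one hidden-state update (illustrative sketch).

    Training: each unit keeps its previous value with probability `rate`.
    Inference: use the expectation rate * h_prev + (1 - rate) * h_new.
    """
    if training:
        return [p if rng.random() < rate else n
                for p, n in zip(h_prev, h_new)]
    return [rate * p + (1.0 - rate) * n for p, n in zip(h_prev, h_new)]


def average_posteriors(model_outputs):
    """Average class posteriors for one frame across ensemble members."""
    n_models = len(model_outputs)
    return [sum(values) / n_models for values in zip(*model_outputs)]


# Inference-time zoneout with rate 0.25 blends old and new states.
blended = zoneout([1.0], [0.0], rate=0.25, training=False)

# Two hypothetical models voting on a two-class frame.
frame_posterior = average_posteriors([[0.5, 0.5], [1.0, 0.0]])
```

Averaging posteriors is only one option; other combination schemes (for example, averaging log-probabilities) are also used in practice.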

Code & Models

Repositories

OrcusCZ/NNAcousticModeling (official)

Taxonomy

Methods: Dropout