Phoneme recognition in TIMIT with BLSTM-CTC

Santiago Fernández; Alex Graves; Juergen Schmidhuber

arXiv:0804.3269 · cs.CL · April 22, 2008 · 30 cites

TL;DR

This paper demonstrates that a single BLSTM-CTC neural network achieves phoneme recognition performance on TIMIT comparable to that of more complex classifier combinations, so a simpler system suffices for the task.

Contribution

The study shows that a single BLSTM-CTC recurrent neural network can match the performance of more complex classifier combinations on TIMIT phoneme recognition.

Findings

01. The network achieves a 24.6% error rate on TIMIT phoneme recognition.

02. A single BLSTM-CTC network performs comparably to classifier combinations.

03. The single-network approach simplifies phoneme recognition without sacrificing accuracy.

Abstract

We compare the performance of a recurrent neural network with the best results published so far on phoneme recognition in the TIMIT database. These published results have been obtained with a combination of classifiers. However, in this paper we apply a single recurrent neural network to the same task. Our recurrent neural network attains an error rate of 24.6%. This result is not significantly different from that obtained by the other best methods, but they rely on a combination of classifiers for achieving comparable performance.
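
The abstract reports an error rate of 24.6% but does not say how it is computed. A common convention for phoneme recognition, assumed here and not confirmed by the text above, is the total edit distance between predicted and reference phoneme sequences divided by the total number of reference phonemes. The sketch below illustrates that convention; the function names `edit_distance` and `phoneme_error_rate` and the toy phoneme labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences."""
    # prev is the previous row of the dynamic-programming table:
    # prev[j] is the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference label
                curr[j - 1] + 1,     # insertion of a hypothesis label
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: List[Sequence[str]],
                       hyps: List[Sequence[str]]) -> float:
    """Assumed definition: summed edit distance / summed reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_labels = sum(len(r) for r in refs)
    return total_errors / total_labels


# Toy example (made-up labels): one substitution and one deletion
# against a five-phoneme reference.
refs = [["sil", "k", "ae", "t", "sil"]]
hyps = [["sil", "k", "eh", "t"]]
rate = phoneme_error_rate(refs, hyps)  # 2 errors / 5 reference labels
```

Under this definition, a rate of 24.6% would mean roughly one edit (substitution, insertion, or deletion) for every four reference phonemes, aggregated over the test set.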

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing