Phoneme recognition in TIMIT with BLSTM-CTC
Santiago Fernández, Alex Graves, Juergen Schmidhuber

TL;DR
This paper demonstrates that a single BLSTM-CTC neural network reaches a 24.6% error rate on TIMIT phoneme recognition, which is comparable to the best published results, all of which rely on combinations of classifiers.
Contribution
The study shows that a single recurrent neural network with BLSTM-CTC can match the performance of more complex classifier ensembles on TIMIT phoneme recognition.
Findings
Achieved 24.6% error rate on TIMIT phoneme recognition.
Single BLSTM-CTC network performs comparably to classifier combinations.
Replaces a combination of classifiers with a single network, with no significant difference in error rate.
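As background for the CTC (connectionist temporal classification) part of the model name, the sketch below shows the standard collapsing rule that maps a frame-level label path to a phoneme sequence: merge consecutive repeats, then drop blanks. It is a minimal illustration, not the authors' decoder. The blank symbol `"_"`, the function name `ctc_collapse`, and the example frames are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, chosen for this example only


def ctc_collapse(frame_labels, blank=BLANK):
    """Map a per-frame label path to a label sequence, CTC style."""
    # Step 1: merge runs of identical consecutive labels into one.
    merged = [label for label, _ in groupby(frame_labels)]
    # Step 2: remove the blank label.
    return [label for label in merged if label != blank]


# Made-up frame-level path, not TIMIT data.
frames = ["_", "sh", "sh", "_", "iy", "iy", "iy", "_", "_"]
print(ctc_collapse(frames))  # ['sh', 'iy']
```

A blank between two identical labels keeps them distinct, so `["a", "_", "a"]` collapses to `["a", "a"]` rather than `["a"]`.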
Abstract
We compare the performance of a recurrent neural network with the best results published so far on phoneme recognition in the TIMIT database. These published results have been obtained with a combination of classifiers. In this paper, however, we apply a single recurrent neural network to the same task. Our recurrent neural network attains an error rate of 24.6%. This result is not significantly different from those obtained by the other best methods, which rely on a combination of classifiers to achieve comparable performance.
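To make the error-rate figure concrete, here is a minimal sketch of one common way to score phoneme recognition: total edit distance (substitutions, insertions, deletions) between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The abstract does not state the exact scoring protocol, so this definition is an assumption, as are the function names and the toy sequences.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference labels
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Total edit distance over total reference length (assumed metric)."""
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference set contains no labels")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Toy example (not TIMIT data): one deleted phoneme out of five.
refs = [["sh", "iy", "hh", "ae", "d"]]
hyps = [["sh", "iy", "ae", "d"]]
print(phoneme_error_rate(refs, hyps))  # 0.2
```

Under this definition, an error rate of 24.6% would correspond to roughly one edit for every four reference phonemes.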
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
