Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR

Dan Lim

arXiv:1710.04515 · cs.CL · October 13, 2017 · 1 cites

TL;DR

This paper presents a convolutional attention-based sequence-to-sequence neural network that uses batch normalization, dropout, and residual connections, and reports a 15.8% phoneme error rate on TIMIT for end-to-end automatic speech recognition.

Contribution

It introduces a convolutional attention-based seq2seq model that combines Luong's attention mechanism with batch normalization, dropout, and residual connections for end-to-end speech recognition.

Findings

01

Achieved a 15.8% phoneme error rate on the TIMIT dataset

02

Demonstrated effectiveness of the proposed model for end-to-end ASR

03

Integrated batch normalization, dropout, and residual connections to enhance accuracy
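
For readers unfamiliar with the metric, phoneme error rate is conventionally the total edit distance (substitutions, insertions, deletions) between hypothesis and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below illustrates that conventional definition only; the function names `edit_distance` and `phoneme_error_rate` and the toy phoneme sequences are hypothetical, and the paper's actual scoring setup is not described on this page.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Total edit distance divided by total reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_phones = sum(len(r) for r in refs)
    return total_errors / total_phones


# Toy example (made-up sequences): one substitution and one deletion,
# so 2 errors over 5 reference phonemes.
refs = [["sil", "k", "ae", "t", "sil"]]
hyps = [["sil", "k", "ah", "t"]]
print(phoneme_error_rate(refs, hyps))
```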

Abstract

This thesis introduces a sequence-to-sequence model with Luong's attention mechanism for end-to-end ASR. It also describes the neural network techniques, including batch normalization, dropout, and residual networks, that constitute the convolutional attention-based seq2seq neural network. Finally, the proposed model proves its effectiveness for speech recognition, achieving a 15.8% phoneme error rate on the TIMIT dataset.
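
As a rough illustration of the attention mechanism named in the abstract, here is a minimal NumPy sketch of one decoder step of Luong-style global attention. It assumes the "general" score variant (decoder state times a learned matrix times each encoder state); the abstract does not say which variant the thesis uses, and the function name, weight names (`W_a`, `W_c`), and dimensions here are hypothetical.

```python
import numpy as np


def luong_attention(decoder_state, encoder_states, W_a, W_c):
    """One step of Luong-style global attention (assumed 'general' score).

    decoder_state:  (d,)    current decoder hidden state
    encoder_states: (T, d)  encoder hidden states for T time steps
    W_a:            (d, d)  score weight matrix
    W_c:            (d, 2d) output projection
    """
    # score_s = decoder_state^T W_a encoder_state_s, for every source step s
    scores = encoder_states @ (W_a.T @ decoder_state)      # (T,)
    # Numerically stable softmax over source steps
    exp_scores = np.exp(scores - scores.max())
    weights = exp_scores / exp_scores.sum()                # (T,)
    # Context vector: attention-weighted average of encoder states
    context = weights @ encoder_states                     # (d,)
    # Attentional hidden state: tanh(W_c [context; decoder_state])
    attentional = np.tanh(W_c @ np.concatenate([context, decoder_state]))
    return attentional, weights


# Random toy inputs, only to show the expected shapes
rng = np.random.default_rng(0)
T, d = 6, 4
h_tilde, alpha = luong_attention(
    rng.standard_normal(d),
    rng.standard_normal((T, d)),
    rng.standard_normal((d, d)),
    rng.standard_normal((d, 2 * d)),
)
print(h_tilde.shape, alpha.shape)
```

In a full model the attentional state would feed the output layer that predicts the next phoneme; that part is omitted here.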

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Geophysical Methods and Applications · Fault Detection and Control Systems

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence · Dropout