Prediction-Adaptation-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition
Yu Zhang, Ekapol Chuangsuwanich, James Glass, Dong Yu

TL;DR
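This paper introduces PAC-RNNs, a recurrent architecture for low-resource speech recognition in which a prediction network and a correction network exchange information in a recurrent loop. PAC-RNNs outperform DNNs and LSTMs on IARPA-Babel tasks, and transfer learning from a similar language improves performance further.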
Contribution
The paper proposes PAC-RNNs, which pair a prediction network with a correction network, and shows that they outperform DNN and LSTM baselines on low-resource speech recognition, with further gains from transfer learning.
Findings
PAC-RNNs outperform DNNs and LSTMs on IARPA-Babel tasks.
Transfer learning from a language similar to the target language can improve performance further.
The correction network uses auxiliary information from the prediction network to help estimate the state probability.
Abstract
In this paper, we investigate the use of prediction-adaptation-correction recurrent neural networks (PAC-RNNs) for low-resource speech recognition. A PAC-RNN is comprised of a pair of neural networks in which a correction network uses auxiliary information given by a prediction network to help estimate the state probability. The information from the correction network is also used by the prediction network in a recurrent loop. Our model outperforms other state-of-the-art neural networks (DNNs, LSTMs) on IARPA-Babel tasks. Moreover, transfer learning from a language that is similar to the target language can help improve performance further.
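The recurrent loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PACRNNSketch`, the single tanh layer per network, the layer sizes, the random initialization, and the choice of what the prediction network outputs are all assumptions made for illustration, since the abstract does not specify them. The sketch shows only the information flow: the correction network reads the current frame plus the prediction network's hidden vector, and the prediction network in turn reads the correction network's hidden vector.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]


def rand_matrix(rng, rows, cols):
    """Small random weights; the initialization scheme is an assumption."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class PACRNNSketch:
    """Hypothetical two-network loop; all names and sizes are illustrative."""

    def __init__(self, n_feat, n_hid, n_states, n_pred_targets, seed=0):
        rng = random.Random(seed)
        self.n_hid = n_hid
        # Correction network: input is [frame, prediction hidden vector].
        self.W_corr = rand_matrix(rng, n_hid, n_feat + n_hid)
        self.U_corr = rand_matrix(rng, n_states, n_hid)
        # Prediction network: input is [frame, correction hidden vector].
        self.W_pred = rand_matrix(rng, n_hid, n_feat + n_hid)
        self.U_pred = rand_matrix(rng, n_pred_targets, n_hid)

    def forward(self, frames):
        h_pred = [0.0] * self.n_hid  # no auxiliary information before frame 0
        state_probs, pred_probs = [], []
        for x in frames:
            # Correction step: estimate state probabilities using the
            # auxiliary information carried over from the prediction network.
            h_corr = [math.tanh(a) for a in matvec(self.W_corr, list(x) + h_pred)]
            state_probs.append(softmax(matvec(self.U_corr, h_corr)))
            # Prediction step: consume the correction network's hidden
            # vector, closing the recurrent loop for the next frame.
            h_pred = [math.tanh(a) for a in matvec(self.W_pred, list(x) + h_corr)]
            pred_probs.append(softmax(matvec(self.U_pred, h_pred)))
        return state_probs, pred_probs
```

Calling `PACRNNSketch(n_feat=4, n_hid=8, n_states=3, n_pred_targets=5).forward(frames)` on a list of 4-dimensional frames returns one state distribution and one auxiliary distribution per frame. Training, the adaptation component named in the title, and the transfer-learning setup are deliberately left out because the abstract does not describe them.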
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques
