Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition
Thejan Rajapakshe, Rajib Rana, Siddique Latif, Sara Khalifa, and Björn W. Schuller

TL;DR
This paper explores pre-training techniques in deep reinforcement learning to enhance speech recognition performance and reduce training time, demonstrating significant improvements on a speech command dataset.
Contribution
It introduces a pre-training approach for deep RL in speech recognition, addressing training efficiency and performance enhancement.
Findings
Significant reduction in training time.
Improved speech recognition accuracy.
Effective pre-training method for deep RL.
Abstract
Deep reinforcement learning (deep RL) combines deep learning with reinforcement learning principles to create efficient methods that learn by interacting with their environment. This has led to breakthroughs in many complex tasks that were previously difficult to solve. However, deep RL requires a large amount of training time, which makes it difficult to use in various real-life applications such as human-computer interaction (HCI). Therefore, in this paper, we study pre-training in deep RL to reduce the training time and improve the performance in speech recognition, a popular application of HCI. We achieve significantly improved performance in less time on a publicly available speech command recognition dataset.
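The abstract does not describe the pre-training procedure itself, so the following is only a toy sketch of the general idea, under stated assumptions: a policy is first pre-trained with supervised labels and then fine-tuned with a REINFORCE-style policy-gradient update, where the action is the predicted command and the reward is +1 for a correct prediction and -1 otherwise. The synthetic features, the linear softmax policy, and every name below (make_data, Policy, pretrain, rl_finetune, run_demo) are hypothetical and are not taken from the paper.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_data(rng, n, n_classes=3, dim=4, noise=0.3):
    """Synthetic stand-in for speech features (assumption, not real audio)."""
    data = []
    for _ in range(n):
        label = rng.randrange(n_classes)
        x = [rng.gauss(1.0 if d == label else 0.0, noise) for d in range(dim)]
        data.append((x, label))
    return data

class Policy:
    """Linear softmax policy: one weight row per command class."""
    def __init__(self, dim, n_classes):
        self.w = [[0.0] * dim for _ in range(n_classes)]

    def probs(self, x):
        return softmax([sum(wi * xi for wi, xi in zip(row, x)) for row in self.w])

    def update(self, x, action, scale, lr):
        """One gradient-ascent step on scale * log pi(action | x)."""
        p = self.probs(x)
        for k, row in enumerate(self.w):
            g = (1.0 if k == action else 0.0) - p[k]
            for d in range(len(row)):
                row[d] += lr * scale * g * x[d]

def pretrain(policy, data, epochs, lr):
    """Supervised pre-training: maximise log-likelihood of the true label."""
    for _ in range(epochs):
        for x, label in data:
            policy.update(x, label, 1.0, lr)

def rl_finetune(policy, data, rng, episodes, lr):
    """REINFORCE fine-tuning: sample an action, weight its gradient by reward."""
    for _ in range(episodes):
        x, label = rng.choice(data)
        p = policy.probs(x)
        action = rng.choices(range(len(p)), weights=p)[0]
        reward = 1.0 if action == label else -1.0
        policy.update(x, action, reward, lr)

def accuracy(policy, data):
    hits = 0
    for x, label in data:
        p = policy.probs(x)
        if p.index(max(p)) == label:
            hits += 1
    return hits / len(data)

def run_demo(seed=0):
    rng = random.Random(seed)
    train = make_data(rng, 200)
    test = make_data(rng, 200)
    policy = Policy(dim=4, n_classes=3)
    before = accuracy(policy, test)
    pretrain(policy, train, epochs=5, lr=0.1)
    after_pretrain = accuracy(policy, test)
    rl_finetune(policy, train, rng, episodes=500, lr=0.05)
    after_rl = accuracy(policy, test)
    return before, after_pretrain, after_rl
```

Calling run_demo() returns the test accuracy at three points: with untrained weights, after supervised pre-training, and after RL fine-tuning. In this sketch the RL stage starts from a policy that already picks mostly correct actions, so it sees informative rewards from the first episode, which is the intuition behind using pre-training to cut deep RL training time.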
Taxonomy
Topics: Music and Audio Processing · Reinforcement Learning in Robotics · Speech Recognition and Synthesis
