TL;DR
This paper introduces a new 'Zeta policy' for deep reinforcement learning tailored to Speech Emotion Recognition, reporting improved performance with the new policy and faster learning with pre-training, including in cross-dataset evaluation.
Contribution
The paper presents a novel 'Zeta policy' specifically designed for SER and explores pre-training in deep RL to enhance learning speed and robustness across datasets.
Findings
Zeta policy outperforms existing policies in SER tasks.
Pre-training reduces training time and warm-up period.
Pre-training is effective across different datasets.
Abstract
Reinforcement Learning (RL) is a semi-supervised learning paradigm in which an agent learns by interacting with an environment. Deep learning combined with RL provides an efficient method for learning how to interact with the environment; this combination is called Deep Reinforcement Learning (deep RL). Deep RL has achieved tremendous success in gaming, such as AlphaGo, but its potential has rarely been explored for challenging tasks like Speech Emotion Recognition (SER). Using deep RL for SER can potentially improve the performance of an automated call centre agent by dynamically learning emotionally aware responses to customer queries. While the policy employed by the RL agent plays a major role in action selection, there is currently no RL policy tailored for SER. In addition, an extended learning period is a general challenge for deep RL, which can impact the speed of learning for SER. Therefore, in…
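To illustrate what "the policy plays a major role in action selection" means when emotion recognition is framed as an RL problem, here is a minimal sketch of a generic epsilon-greedy policy choosing an emotion label from Q-values. This is not the paper's Zeta policy, whose definition is not given in this summary. The label set, the reward scheme, and the function names are assumptions made for illustration only.

```python
import random

# Hypothetical emotion label set; the classes used in the paper are not listed here.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def epsilon_greedy(q_values, epsilon, rng):
    """Return an action index: random with probability epsilon, otherwise the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: pick any emotion label
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit


def reward(action, true_label):
    """Assumed reward scheme: +1 for the correct emotion, -1 otherwise."""
    return 1.0 if action == true_label else -1.0


rng = random.Random(0)
# Stand-in for the output of a Q-network on one utterance (made-up numbers).
q_values = [0.1, 0.7, 0.15, 0.05]
action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)  # epsilon=0 is purely greedy
print(EMOTIONS[action], reward(action, true_label=1))
```

With a high epsilon the agent mostly explores, which is one source of the long warm-up period the abstract mentions; lowering epsilon over time shifts it toward exploiting what it has learned.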
Taxonomy
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
