A Survey on Deep Reinforcement Learning for Audio-Based Applications
Siddique Latif, Heriberto Cuayáhuitl, Farrukh Pervez, Fahad Shamshad, Hafiz Shehbaz Ali, and Erik Cambria

TL;DR
This survey reviews advances in deep reinforcement learning (DRL) for audio applications, highlighting recent progress, challenges, and future research directions in speech and music processing.
Contribution
It provides a comprehensive overview of DRL methods applied to audio signals, consolidating research across speech and music domains and identifying open challenges.
Findings
DRL effectively enhances audio signal processing tasks.
Several applications show improved performance with DRL techniques.
Open research areas include handling complex audio environments.
Abstract
Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence (AI) by endowing autonomous systems with high levels of understanding of the real world. Currently, deep learning (DL) is enabling DRL to effectively solve previously intractable problems across various fields. Most importantly, DRL algorithms are also being employed in audio signal processing to learn directly from speech, music and other sound signals in order to create audio-based autonomous systems that have many promising applications in the real world. In this article, we conduct a comprehensive survey of the progress of DRL in the audio domain by bringing together research studies across different speech and music-related areas. We begin with an introduction to the general field of DL and reinforcement learning (RL), then progress to the main DRL methods and their applications in the…
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
