A Review of Deep Learning Techniques for Speech Processing
Ambuj Mehrish, Navonil Majumder, Rishabh Bhardwaj, Rada Mihalcea, Soujanya Poria

TL;DR
This review comprehensively covers deep learning models and their applications in speech processing, highlighting recent advances, challenges, and future research directions in the field.
Contribution
It provides a detailed overview of the evolution, categorization, and comparison of deep learning techniques used in speech processing tasks.
Findings
Deep learning models have significantly improved speech recognition and synthesis.
Transformers and diffusion models are among the latest architectures used.
Challenges include developing more efficient and interpretable models.
Abstract
The field of speech processing has undergone a transformative shift with the advent of deep learning. The use of multiple processing layers has enabled the creation of models capable of extracting intricate features from speech data. This development has paved the way for unparalleled advancements in speech recognition, text-to-speech synthesis, automatic speech recognition, and emotion recognition, propelling the performance of these tasks to unprecedented heights. The power of deep learning techniques has opened up new avenues for research and innovation in the field of speech processing, with far-reaching implications for a range of industries and applications. This review paper provides a comprehensive overview of the key deep learning models and their applications in speech-processing tasks. We begin by tracing the evolution of speech processing research, from early approaches,…
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
Methods: Diffusion
