Supervised Speech Separation Based on Deep Learning: An Overview

DeLiang Wang; Jitong Chen

arXiv:1708.07524 · cs.CL · June 18, 2018 · 58 cites

TL;DR

This paper reviews recent advances in deep learning-based supervised speech separation, highlighting methods, challenges, and progress in separating speech from background noise using neural networks.

Contribution

It provides a comprehensive overview of supervised speech separation techniques, focusing on deep learning methods, training targets, features, and generalization issues.

Findings

01. Deep learning has significantly improved speech separation performance.

02. Various monaural and multi-microphone algorithms have been developed.

03. Generalization remains a key challenge in supervised speech separation.

Abstract

Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning to supervised speech separation has dramatically accelerated progress and boosted separation performance. This article provides a comprehensive overview of the research on deep learning based supervised speech separation in the last several years. We first introduce the background of speech separation and the formulation of supervised separation. Then we discuss three main components of supervised…

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation