Beam-Guided TasNet: An Iterative Speech Separation Framework with Multi-Channel Output

Hangting Chen; Yang Yi; Dang Feng; Pengyuan Zhang

arXiv:2102.02998 · eess.AS · April 13, 2022 · Interspeech

TL;DR

This paper introduces Beam-Guided TasNet, an iterative multi-channel speech separation framework that enhances separation performance by integrating neural network-based separation with beamforming in a cyclic, mutually reinforcing manner.

Contribution

The paper proposes a cyclic framework with multi-channel input and multi-channel output. This enables iterative refinement, improves on the baseline Beam-TasNet, and approaches oracle MVDR results.

Findings

01 Achieved SDR of 21.5 dB on spatialized WSJ0-2MIX
02 Exceeded baseline Beam-TasNet by 4.1 dB SDR
03 Narrowed performance gap with oracle MVDR to 2 dB

Abstract

The time-domain audio separation network (TasNet) has achieved remarkable performance in blind source separation (BSS). The classic multi-channel speech processing framework employs signal estimation and beamforming. For example, Beam-TasNet links a multi-channel convolutional TasNet (MC-Conv-TasNet) with minimum variance distortionless response (MVDR) beamforming, which leverages the strong modeling ability of a data-driven network and boosts the performance of beamforming with an accurate estimation of speech statistics. Such integration can be viewed as a directed acyclic graph that accepts multi-channel input and generates multi-source output. In this paper, we design a "multi-channel input, multi-channel multi-source output" (MIMMO) speech separation system entitled "Beam-Guided TasNet", where MC-Conv-TasNet and MVDR can interact and promote each other more compactly under a directed cyclic…
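
To make the "speech statistics feed the beamformer" step in the abstract concrete, here is a minimal NumPy sketch of MVDR beamforming driven by estimated signals. It is not the authors' implementation. It assumes one common closed form of MVDR, w = (Φ_N⁻¹ Φ_S) u / tr(Φ_N⁻¹ Φ_S), where Φ_S and Φ_N are per-frequency spatial covariance matrices of the estimated target and of the estimated interference, and u selects a reference microphone. The function names, the array layout (channels, frames, frequencies), and the small `eps` regularizer are all illustrative assumptions.

```python
import numpy as np


def spatial_covariance(stft):
    """Per-frequency spatial covariance of a multi-channel STFT.

    stft: complex array of shape (channels, frames, freqs).
    Returns an array of shape (freqs, channels, channels).
    """
    # Average the outer product x x^H over time frames for every frequency bin.
    return np.einsum("ctf,dtf->fcd", stft, stft.conj()) / stft.shape[1]


def mvdr_weights(phi_s, phi_n, ref=0, eps=1e-8):
    """MVDR weights w = (Phi_N^-1 Phi_S) u / trace(Phi_N^-1 Phi_S).

    phi_s, phi_n: arrays of shape (freqs, channels, channels).
    ref: index of the reference microphone (the one-hot vector u).
    Returns weights of shape (freqs, channels).
    """
    n_ch = phi_n.shape[-1]
    # Diagonal loading (an assumption of this sketch) keeps the solve stable.
    phi_n = phi_n + eps * np.eye(n_ch)
    numerator = np.linalg.solve(phi_n, phi_s)  # Phi_N^-1 Phi_S per frequency
    trace = np.trace(numerator, axis1=-2, axis2=-1)[:, None]
    # Selecting column `ref` is the same as multiplying by the one-hot u.
    return numerator[..., ref] / (trace + eps)


def apply_beamformer(weights, mixture_stft):
    """Apply y(t, f) = w(f)^H x(t, f) to a (channels, frames, freqs) mixture."""
    return np.einsum("fc,ctf->tf", weights.conj(), mixture_stft)
```

In a Beam-TasNet-style pipeline, the inputs to `spatial_covariance` would be STFTs of the network's multi-channel estimates of each speaker, with the other speaker's estimate playing the role of interference. Choosing a different `ref` for each microphone yields one beamformed output per channel, which is one plausible way to obtain a multi-channel output that can be fed back for another pass.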

Code & Models

Repositories

hangtingchen/Beam-Guided-TasNet (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Music and Audio Processing