On the Design and Training Strategies for RNN-based Online Neural Speech Separation Systems

Kai Li; Yi Luo

arXiv:2206.07340·cs.SD·February 22, 2023·1 cites

Open Access

TL;DR

This paper explores methods to convert offline RNN-based neural speech separation systems into effective online systems, reducing performance gaps through layer reorganization and specialized training strategies.

Contribution

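It decomposes or reorganizes the forward and backward RNN layers of a bidirectional RNN layer into an online path and an offline path that share one set of model parameters, and it adds two training strategies for improving the online model: starting from a pretrained offline model, or training with a multitask objective.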

Findings

01

Layer decomposition narrows the performance gap between offline and online models

02

Training strategies improve online separation performance

03

Proposed methods outperform baseline online models

Abstract

While the performance of offline neural speech separation systems has been greatly advanced by the recent development of novel neural network architectures, there is typically an inevitable performance gap between the systems and their online variants. In this paper, we investigate how RNN-based offline neural speech separation systems can be changed into their online counterparts while mitigating the performance degradation. We decompose or reorganize the forward and backward RNN layers in a bidirectional RNN layer to form an online path and an offline path, which enables the model to perform both online and offline processing with the same set of model parameters. We further introduce two training strategies for improving the online model via either a pretrained offline model or a multitask training objective. Experimental results show that compared to the online models that are trained…
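The shared-parameter idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes, for illustration only, that the online path runs just the forward RNN while the offline path adds the backward RNN, and that the multitask objective is a weighted sum of an online loss and an offline loss. The names `SharedBiRNN`, `rnn_pass`, `multitask_loss`, the weight `alpha`, the fixed scalar weights, and the use of mean squared error are all hypothetical stand-ins; the abstract does not specify any of them.

```python
import math


def rnn_pass(xs, w_in, w_rec, reverse=False):
    """Run a scalar tanh RNN over xs and return one hidden state per frame."""
    order = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    h = 0.0
    out = [0.0] * len(xs)
    for t in order:
        h = math.tanh(w_in * xs[t] + w_rec * h)
        out[t] = h
    return out


class SharedBiRNN:
    """Hypothetical layer: one parameter set, two processing paths."""

    def __init__(self):
        # Fixed illustrative weights (assumption); a real model would learn them.
        self.fwd = (0.7, 0.5)    # forward RNN: input weight, recurrent weight
        self.bwd = (-0.4, 0.6)   # backward RNN: input weight, recurrent weight
        self.mix = (1.0, 0.5)    # output weights for forward and backward states

    def online(self, xs):
        # Causal path: frame t depends only on frames 0..t.
        hf = rnn_pass(xs, *self.fwd)
        return [self.mix[0] * f for f in hf]

    def offline(self, xs):
        # Non-causal path: reuses the same forward weights, adds the backward pass.
        hf = rnn_pass(xs, *self.fwd)
        hb = rnn_pass(xs, *self.bwd, reverse=True)
        return [self.mix[0] * f + self.mix[1] * b for f, b in zip(hf, hb)]


def mse(ys, targets):
    """Mean squared error, used here as a stand-in for a separation loss."""
    return sum((y - t) ** 2 for y, t in zip(ys, targets)) / len(ys)


def multitask_loss(layer, xs, targets, alpha=0.5):
    # Assumed form of a multitask objective: weighted online + offline losses.
    online_term = mse(layer.online(xs), targets)
    offline_term = mse(layer.offline(xs), targets)
    return alpha * online_term + (1.0 - alpha) * offline_term
```

In this sketch, changing future input frames leaves the earlier outputs of `online` unchanged but alters the outputs of `offline`, which is the property that lets the causal path run frame by frame while both paths draw on the same weights.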

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research