BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement

Sunwoo Kim; Minje Kim

arXiv:2111.09372·eess.AS·February 11, 2022·1 cites

TL;DR

BLOOM-Net trains the separator blocks of a masking-based speech enhancement network sequentially, so the resulting model can adjust its run-time complexity to the test-time environment with only a slight loss in enhancement performance.

Contribution

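The paper proposes a blockwise training method for masking-based speech enhancement networks: the separator blocks of a residual architecture are trained one after another, yielding a modular model that accommodates different performance needs and resource constraints with minimal added memory or training overhead.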

Findings

01 Achieves scalability with only a slight performance degradation.

02 Allows dynamic adjustment of run-time complexity.

03 Maintains low memory and training overhead.

Abstract

In this paper, we present a blockwise optimization method for masking-based networks (BLOOM-Net) for training scalable speech enhancement networks. Here, we design our network with a residual learning scheme and train the internal separator blocks sequentially to obtain a scalable masking-based deep neural network for speech enhancement. Its scalability lets it dynamically adjust the run-time complexity depending on the test time environment. To this end, we modularize our models in that they can flexibly accommodate varying needs for enhancement performance and constraints on the resources, incurring minimal memory or training overhead due to the added scalability. Our experiments on speech enhancement demonstrate that the proposed blockwise optimization method achieves the desired scalability with only a slight performance degradation compared to corresponding models trained…
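The blockwise idea in the abstract can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration, not the authors' architecture: each "separator block" is a two-parameter residual update, the mask is a plain sigmoid, gradients come from finite differences, and the names `forward`, `loss`, and `train_blockwise` are hypothetical. What the sketch does show is the training order (block k is fitted while blocks 1..k-1 stay frozen) and the resulting ability to stop after any number of blocks at test time.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, noisy, depth):
    """Run the first `depth` residual blocks, then mask the noisy input."""
    h = list(noisy)
    for w, b in params[:depth]:
        # Residual update: each block adds a correction to the running features.
        h = [v + (w * v + b) for v in h]
    mask = [sigmoid(v) for v in h]
    return [m * x for m, x in zip(mask, noisy)]

def loss(params, data, depth):
    """Mean squared error between the depth-`depth` estimate and the clean target."""
    total, count = 0.0, 0
    for noisy, clean in data:
        est = forward(params, noisy, depth)
        total += sum((e - c) ** 2 for e, c in zip(est, clean))
        count += len(clean)
    return total / count

def train_blockwise(data, num_blocks, steps=200, eps=1e-4):
    """Fit blocks one at a time; earlier blocks are never modified."""
    params = []
    for k in range(1, num_blocks + 1):
        cur = [0.0, 0.0]  # a new block starts as the identity mapping
        lr = 0.5
        for _ in range(steps):
            base = loss(params + [cur], data, k)
            grad = []
            for i in range(2):  # finite-difference gradient for this block only
                bumped = list(cur)
                bumped[i] += eps
                grad.append((loss(params + [bumped], data, k) - base) / eps)
            trial = [p - lr * g for p, g in zip(cur, grad)]
            if loss(params + [trial], data, k) < base:
                cur = trial   # accept only steps that reduce the loss
            else:
                lr *= 0.5     # otherwise shrink the step size
        params.append(cur)    # freeze the block before moving on
    return params

# Toy data (made up): one noisy magnitude frame and its clean counterpart.
data = [([1.0, 2.0, 0.5, 1.5], [0.2, 1.8, 0.1, 1.2])]
params = train_blockwise(data, num_blocks=3)
# Loss when inference stops after 0, 1, 2, or 3 blocks.
losses = [loss(params, data, depth) for depth in range(4)]
```

Because every new block starts as the identity and only loss-reducing steps are accepted, the entries of `losses` cannot increase with depth in this toy setting. A deployment could then pick the largest `depth` its compute budget allows and call `forward(params, noisy, depth)`.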

Taxonomy

Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Speech Recognition and Synthesis