Complexity Scaling for Speech Denoising

Hangting Chen; Jianwei Yu; Chao Weng

arXiv:2309.07757 · eess.AS · September 15, 2023

TL;DR

This paper introduces a unified Multi-Path Transform-based (MPT) architecture for speech denoising that scales across a wide range of computational complexities, and reports a predictable empirical relationship between computational cost (MACs) and performance.

Contribution

The study proposes a novel scalable architecture for speech denoising and explores the empirical relationship between model complexity and performance, unifying models across different complexity levels.

Findings

1. High-performance models across a wide complexity range
2. PESQ-WB and SI-SNR increase linearly with the logarithm of MACs
3. Unified architecture simplifies deployment for diverse devices
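
The log-linear trend in the findings can be illustrated with a short least-squares fit. This is a minimal sketch, not the authors' procedure: the function names `fit_log_linear` and `predict`, the base-10 logarithm, and every number below are assumptions chosen for illustration. The data points are synthetic placeholders generated from a known line, not results from the paper.

```python
import numpy as np


def fit_log_linear(macs, metric):
    """Least-squares fit of metric ~= a + b * log10(MACs).

    Returns (a, b). Hypothetical helper, not taken from the paper.
    """
    x = np.log10(np.asarray(macs, dtype=float))
    y = np.asarray(metric, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest,
    # so for degree 1 the slope comes first, then the intercept.
    b, a = np.polyfit(x, y, deg=1)
    return a, b


def predict(macs, a, b):
    """Evaluate the fitted line at the given MACs value(s)."""
    return a + b * np.log10(np.asarray(macs, dtype=float))


# Synthetic placeholder data (NOT measurements from the paper):
# scores are generated exactly on a line in log10(MACs).
true_a, true_b = -1.0, 0.5
macs = np.array([1e7, 1e8, 1e9, 1e10])
scores = true_a + true_b * np.log10(macs)

a, b = fit_log_linear(macs, scores)
extrapolated = float(predict(1e11, a, b))
```

Under such a fit, the slope `b` is the metric gain per tenfold increase in MACs, which is what makes the relationship useful for budgeting: a target score can be mapped back to an approximate compute requirement, provided the trend holds outside the measured range.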

Abstract

Computational complexity is critical when deploying deep learning-based speech denoising models for on-device applications. Most prior research focused on optimizing model architectures to meet specific computational cost constraints, often creating distinct neural network architectures for different complexity limitations. This study conducts complexity scaling for speech denoising tasks, aiming to consolidate models with various complexities into a unified architecture. We present a Multi-Path Transform-based (MPT) architecture to handle both low- and high-complexity scenarios. A series of MPT networks present high performance covering a wide range of computational complexities on the DNS challenge dataset. Moreover, inspired by the scaling experiments in natural language processing, we explore the empirical relationship between model performance and computational cost on the…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing