Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds

Hanbin Bae; Pavel Andreev; Azat Saginbaev; Nicholas Babaev; Won-Jun Lee; Hosang Sung; Hoon-Young Cho

arXiv:2409.18705·eess.AS·September 30, 2024

TL;DR

This paper presents a low-latency, on-device speech enhancement method for TWS earbuds that improves speech clarity in noisy environments with less than 3 ms latency, balancing quality and computational efficiency.

Contribution

The paper introduces an on-device speech enhancement approach optimized for TWS earbuds. It addresses the latency and complexity constraints through choices of network architecture and domain, loss function design, pruning method, and hardware-specific optimization.

Findings

01. Substantially improved speech enhancement quality over baseline models

02. Reduced computational complexity and algorithmic latency

03. Less than 3 ms latency for real-time enhancement

Abstract

This paper introduces a speech enhancement solution tailored for on-device use in true wireless stereo (TWS) earbuds. The solution is designed to support conversations in noisy environments while active noise cancellation (ANC) is enabled. The primary challenges for speech enhancement models in this context are computational complexity, which limits on-device use, and latency, which must stay below 3 ms to preserve a live conversation. To address these issues, we evaluated several crucial design elements, including the network architecture and domain, the design of loss functions, the pruning method, and hardware-specific optimization. As a result, we demonstrated substantial improvements in speech enhancement quality compared with baseline models, while simultaneously reducing computational complexity and algorithmic latency.
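
To make the 3 ms constraint concrete, the following is a minimal sketch of how a latency budget translates into a number of buffered audio samples. The sample rates and buffer sizes are illustrative assumptions, not values taken from the paper, and the function names are hypothetical. The sketch counts only the delay from buffering samples before processing; it ignores compute time and any other system delay.

```python
def buffered_latency_ms(n_samples: int, sample_rate_hz: int) -> float:
    """Delay, in milliseconds, incurred by waiting for n_samples of audio."""
    return 1000.0 * n_samples / sample_rate_hz


def fits_budget(n_samples: int, sample_rate_hz: int, budget_ms: float = 3.0) -> bool:
    """True if buffering n_samples stays strictly below the latency budget."""
    return buffered_latency_ms(n_samples, sample_rate_hz) < budget_ms


if __name__ == "__main__":
    # Assumed sample rates and buffer sizes, chosen only for illustration.
    for rate, size in [(16000, 32), (16000, 48), (48000, 128)]:
        ms = buffered_latency_ms(size, rate)
        print(f"{size} samples at {rate} Hz: {ms:.2f} ms, fits: {fits_budget(size, rate)}")
```

Under these assumptions, a model running at 16 kHz could buffer fewer than 48 samples in total (current frame plus any look-ahead) before its output is due, which is far shorter than the analysis windows of tens of milliseconds common in offline speech enhancement.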

Taxonomy

Methods: Pruning