NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping

Jan Büthe; Ahmed Mustafa; Jean-Marc Valin; Karim Helwani; Michael M. Goodwin

arXiv:2309.14521·eess.AS·January 15, 2024

TL;DR

NoLACE introduces an adaptive temporal shaping module to enhance low-complexity speech codecs, significantly improving quality over existing methods while maintaining low delay and complexity, especially at low bitrates.

Contribution

The paper proposes NoLACE, which adds a novel adaptive temporal shaping module to the LACE model, improving speech quality at low bitrates while keeping complexity low.

Findings

01

NoLACE outperforms baseline Opus and larger LACE models at 6, 9, and 12 kb/s.

02

NoLACE improves speech quality while maintaining low complexity.

03

LACE and NoLACE are effective with automatic speech recognition systems.

Abstract

Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. By comparison, DNN-based methods deliver higher quality, but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this problem by combining DNNs with classical long-term/short-term postfiltering, resulting in a causal low-complexity model. A shortcoming of the LACE model is, however, that quality quickly saturates when the model size is scaled up. To mitigate this problem, we propose a novel adaptive temporal shaping module that adds high temporal resolution to the LACE model, resulting in the Non-Linear Adaptive Coding Enhancer (NoLACE). We adapt NoLACE to enhance the Opus codec and show that NoLACE…
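
The following is a rough illustration of what causal, per-sample temporal shaping can look like. It is not the NoLACE implementation. The envelope definition, the linear gain predictor, and every name used (`temporal_envelope`, `shape_frame`, `w_feat`, `w_env`, `bias`, `pool`) are assumptions made for this sketch. The general idea shown is that gains are predicted at sample resolution from frame-level features plus a coarse envelope of the current frame, then multiplied onto the signal.

```python
import math

def temporal_envelope(frame, pool=4):
    """Log of the mean absolute value over consecutive windows of `pool` samples.

    This envelope definition is an assumption chosen for illustration.
    """
    return [
        math.log(sum(abs(x) for x in frame[i:i + pool]) / pool + 1e-6)
        for i in range(0, len(frame), pool)
    ]

def shape_frame(frame, features, w_feat, w_env, bias, pool=4):
    """Multiply each sample by a positive gain predicted from the current frame only.

    Hypothetical parameters (a stand-in for a small trained network):
      w_feat: `pool` rows, each with len(features) weights
      w_env:  `pool` weights applied to the window's envelope value
      bias:   `pool` offsets
    """
    if len(frame) % pool != 0:
        raise ValueError("frame length must be a multiple of pool")
    # Feature-dependent part of the log-gain, one value per position inside a window.
    feat_term = [sum(w * f for w, f in zip(row, features)) for row in w_feat]
    shaped = []
    for k, env in enumerate(temporal_envelope(frame, pool)):
        for j in range(pool):
            log_gain = feat_term[j] + w_env[j] * env + bias[j]
            # exp keeps the gain positive, so shaping never flips the sign of a sample.
            shaped.append(frame[k * pool + j] * math.exp(log_gain))
    return shaped
```

Because the gains depend only on the current frame and its features, the sketch needs no lookahead. With all weights and biases set to zero every gain equals 1 and the frame passes through unchanged.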

Code & Models

Repositories

https://gitlab.xiph.org/xiph/opus

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Advanced Data Compression Techniques