LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions
Jan Büthe, Jean-Marc Valin, Ahmed Mustafa

TL;DR
This paper introduces LACE, a lightweight causal neural network that enhances coded speech by generating adaptive filter kernels at low complexity and without added delay, making it suitable for real-time use on desktop and mobile CPUs.
Contribution
LACE is a novel DNN model with only 300K parameters that generates adaptive filter kernels for speech enhancement without adding delay, enabling practical deployment.
Findings
Effective wideband encoding at bitrates as low as 6 kb/s
Roughly 100 MFLOPS of complexity, practical for desktop and mobile CPUs
No added delay, which allows integration into the Opus codec
Abstract
Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity. Deep Neural Networks (DNNs) can be much more effective, but require high complexity and model size, or added delay. We propose a DNN model that generates classical filter kernels on a per-frame basis with a model of just 300 K parameters and 100 MFLOPS complexity, which is a practical complexity for desktop or mobile device CPUs. The lack of added delay allows it to be integrated into the Opus codec, and we demonstrate that it enables effective wideband encoding for bitrates down to 6 kb/s.
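To make the idea of "filter kernels on a per-frame basis" concrete, here is a minimal Python sketch of causal FIR filtering in which the kernel changes from frame to frame. It is an illustration only, not the authors' implementation: the function name `adaptive_causal_fir`, the frame size, and the hand-picked kernels are all assumptions, and the part of LACE that predicts the kernels with a DNN is not modeled. No smoothing between neighboring frames is attempted here.

```python
from typing import List, Sequence


def adaptive_causal_fir(signal: Sequence[float],
                        kernels: Sequence[Sequence[float]],
                        frame_size: int) -> List[float]:
    """Filter `signal` with a different FIR kernel for each frame.

    Hypothetical helper for illustration. kernels[i] is applied to output
    samples i*frame_size .. (i+1)*frame_size - 1. Output sample n depends
    only on signal[n], signal[n-1], ..., so the filter is causal and needs
    no lookahead.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    if len(kernels) * frame_size < len(signal):
        raise ValueError("need one kernel per frame")

    out: List[float] = []
    for n in range(len(signal)):
        kernel = kernels[n // frame_size]  # kernel assigned to this frame
        acc = 0.0
        for k, tap in enumerate(kernel):
            if n - k >= 0:                 # samples before the start count as zero
                acc += tap * signal[n - k]
        out.append(acc)
    return out


# Assumed toy input: two frames of two samples each.
# Frame 0 uses a pass-through kernel, frame 1 a two-tap averaging kernel.
decoded = [1.0, 2.0, 3.0, 4.0]
kernels = [[1.0], [0.5, 0.5]]
enhanced = adaptive_causal_fir(decoded, kernels, frame_size=2)
print(enhanced)  # [1.0, 2.0, 2.5, 3.5]
```

Because each output sample reads only current and past input samples, swapping the kernel at a frame boundary adds no algorithmic delay, which is the property the abstract relies on for integration into Opus.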
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Speech Recognition and Synthesis
