CheapNET: Improving Light-weight speech enhancement network by projected loss function
Kaijun Tan, Benzhe Dai, Jiakui Li, Wenyu Mao

TL;DR
CheapNET introduces a novel projection loss function for lightweight speech enhancement models, significantly improving noise suppression and echo cancellation performance while maintaining low computational complexity for real-time applications.
Contribution
The paper proposes a new projection loss function that enhances noise suppression and echo cancellation in lightweight speech enhancement networks, surpassing traditional MSE-based methods.
Findings
Achieves near state-of-the-art noise suppression with 3.1M parameters.
Outperforms industry-leading echo cancellation models.
Maintains low computational load suitable for edge devices.
Abstract
Noise suppression and echo cancellation are critical in speech enhancement and essential for smart devices and real-time communication. Deployed in voice processing front-ends and edge devices, these algorithms must ensure efficient real-time inference with low computational demands. Traditional edge-based noise suppression often uses MSE-based amplitude spectrum mask training, but this approach has limitations. We introduce a novel projection loss function, diverging from MSE, to enhance noise suppression. This method uses projection techniques to isolate key audio components from noise, significantly improving model performance. For echo cancellation, the function enables direct predictions on LAEC pre-processed outputs, substantially enhancing performance. Our noise suppression model achieves near state-of-the-art results with only 3.1M parameters and 0.4GFlops/s computational load.…
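The abstract contrasts MSE-based amplitude spectrum mask training with a projection-based loss but does not define the latter. The sketch below is only an illustration of the general idea, not the paper's loss: it assumes a generic orthogonal projection of the estimate onto the clean target (as in SI-SDR-style objectives) and penalizes the energy left over after projection. The function names `mse_mask_loss`, `projection_split`, and `projection_loss` are hypothetical.

```python
import numpy as np


def mse_mask_loss(pred_mask, noisy_mag, clean_mag):
    """Baseline: MSE between the masked noisy magnitude and the clean magnitude."""
    return float(np.mean((pred_mask * noisy_mag - clean_mag) ** 2))


def projection_split(estimate, target, eps=1e-8):
    """Split `estimate` into a part along `target` and an orthogonal residual.

    Assumption: a plain orthogonal projection; the paper's actual
    projection may differ.
    """
    # np.vdot conjugates its first argument, so this also works for complex spectra.
    scale = np.vdot(target, estimate) / (np.vdot(target, target) + eps)
    along = scale * target
    residual = estimate - along
    return along, residual


def projection_loss(estimate, target, eps=1e-8):
    """Hypothetical loss: residual-to-aligned energy ratio in dB (lower is better)."""
    along, residual = projection_split(estimate, target, eps)
    residual_energy = np.sum(np.abs(residual) ** 2) + eps
    aligned_energy = np.sum(np.abs(along) ** 2) + eps
    return float(10.0 * np.log10(residual_energy / aligned_energy))
```

Unlike the MSE baseline, a ratio of this kind is insensitive to the overall scale of the estimate and only measures how much of it fails to align with the clean target. Whether CheapNET's loss shares that property is not stated in the abstract.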
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research
