Causal Signal-Based DCCRN with Overlapped-Frame Prediction for Online Speech Enhancement
Julitta Bartolewska, Stanisław Kacprzak, Konrad Kowalczyk

TL;DR
This paper introduces a causal DCCRN model with overlapped-frame prediction for online speech enhancement, reducing latency and model size while maintaining or improving performance.
Contribution
It proposes novel modifications to DCCRN, including complex filtering, overlapped-frame prediction, and causal convolutions, enabling efficient real-time speech enhancement.
Findings
Achieves similar or better speech quality metrics compared to the original DCCRN.
Reduces latency and network parameters by approximately 30%.
Maintains performance with minimal look-ahead requirements.
Abstract
Speech enhancement aims to improve the quality and intelligibility of speech captured in a noisy microphone signal. In many applications, it is crucial to enable processing with low computational complexity and minimal access to future signal samples (look-ahead). This paper presents a signal-based causal DCCRN that improves online single-channel speech enhancement by reducing the required look-ahead and the number of network parameters. The proposed modifications include complex filtering of the signal, application of overlapped-frame prediction, causal convolutions and deconvolutions, and modification of the loss function. Experimental results indicate that the proposed model with overlapped signal prediction and additional adjustments achieves similar or better performance than the original DCCRN in terms of various speech enhancement metrics,…
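To illustrate what "causal" means for the convolutions mentioned in the abstract, here is a minimal sketch of a causal 1-D convolution over time. It is an illustration under simplifying assumptions, not the paper's implementation: the network's actual layers are complex-valued and operate on time-frequency data, and the function names `causal_conv1d` and `depends_on_future` are hypothetical helpers introduced here. The idea shown is only that padding on the past side alone removes any dependence on future samples, which is what eliminates look-ahead in the convolutional layers.

```python
def causal_conv1d(signal, kernel):
    """Causal 1-D convolution: output[t] uses only signal[t], signal[t-1], ...

    kernel[0] weights the current sample, kernel[k] the sample k steps back.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    pad = len(kernel) - 1
    # Left-pad only, so no future samples are ever read (zero look-ahead).
    padded = [0.0] * pad + [float(x) for x in signal]
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += w * padded[t + pad - k]  # padded[t + pad] is signal[t]
        out.append(acc)
    return out


def depends_on_future(kernel, length=8, t=3):
    """Return True if changing samples after index t alters output[t]."""
    base = [float(i) for i in range(length)]
    changed = base[: t + 1] + [x + 100.0 for x in base[t + 1:]]
    return causal_conv1d(base, kernel)[t] != causal_conv1d(changed, kernel)[t]
```

Calling `depends_on_future(kernel)` with any kernel should return `False`, since each output sample is computed from the current and past inputs only. A symmetric ("same") padding would instead read `(len(kernel) - 1) // 2` future samples per layer, and that look-ahead accumulates across stacked layers.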
