Objective Metrics to Evaluate Residual-Echo Suppression During Double-Talk
Amir Ivry, Israel Cohen, Baruch Berdugo

TL;DR
This paper introduces two objective metrics, the desired-speech maintained level (DSML) and the residual-echo suppression level (RESL), to separately evaluate speech preservation and residual-echo suppression during double-talk. The metrics align well with human ratings as estimated by DNSMOS and aid system tuning.
Contribution
The paper proposes novel metrics that independently measure speech preservation and echo suppression, improving evaluation accuracy over traditional SDR and aiding system design.
Findings
The DSML and RESL correlate strongly with DNSMOS scores.
The metrics generalize well across different setups.
A practical scheme for dynamic system tuning is provided.
Abstract
Human subjective evaluation is optimal to assess speech quality for human perception. The recently introduced deep noise suppression mean opinion score (DNSMOS) metric was shown to estimate human ratings with great accuracy. The signal-to-distortion ratio (SDR) metric is widely used to evaluate residual-echo suppression (RES) systems by estimating speech quality during double-talk. However, since the SDR is affected by both speech distortion and residual-echo presence, it does not correlate well with human ratings according to the DNSMOS. To address that, we introduce two objective metrics to separately quantify the desired-speech maintained level (DSML) and residual-echo suppression level (RESL) during double-talk. These metrics are evaluated using a deep learning-based RES-system with a tunable design parameter. Using 280 hours of real and simulated recordings, we show that the DSML…
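The abstract's point that the SDR mixes speech distortion with residual echo can be illustrated with a small sketch. It assumes one common textbook form of the SDR, 10·log10(‖s‖² / ‖s − ŝ‖²), which may differ from the exact definition used in the paper. The signals, the `sdr_db` helper, and all numeric values are hypothetical and chosen only for illustration. The DSML and RESL themselves are not implemented here, since their definitions are not given in this summary.

```python
import math

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

def sdr_db(desired, estimate):
    """SDR in dB, assuming the form 10*log10(||s||^2 / ||s - s_hat||^2)."""
    error = [s - e for s, e in zip(desired, estimate)]
    return 10.0 * math.log10(energy(desired) / energy(error))

# Toy double-talk frame (hypothetical signals, not data from the paper).
desired = [math.sin(0.1 * n) for n in range(1000)]      # near-end speech stand-in
echo = [math.sin(0.37 * n + 1.0) for n in range(1000)]  # residual-echo stand-in

# Case A: echo fully removed, but the desired speech is attenuated by 10%.
estimate_a = [0.9 * s for s in desired]

# Case B: desired speech untouched, but residual echo remains.
# The echo is scaled so that its energy equals the error energy of case A.
scale = math.sqrt(0.01 * energy(desired) / energy(echo))
estimate_b = [s + scale * r for s, r in zip(desired, echo)]

sdr_a = sdr_db(desired, estimate_a)
sdr_b = sdr_db(desired, estimate_b)

# Both cases yield about 20 dB, although the underlying artifacts differ.
print(f"SDR, distortion only:    {sdr_a:.2f} dB")
print(f"SDR, residual echo only: {sdr_b:.2f} dB")
```

In this toy setup the two outputs receive the same SDR even though one suffers only from speech distortion and the other only from residual echo, which is the ambiguity that motivates reporting the two effects separately.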
Taxonomy
Topics: Speech and Audio Processing · Ultrasonics and Acoustic Wave Propagation · Structural Health Monitoring Techniques
