DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors
Chandan K A Reddy, Vishak Gopal, Ross Cutler

TL;DR
This paper introduces DNSMOS, a no-reference perceptual speech quality metric for evaluating noise suppression algorithms. It correlates well with human ratings and does not require a clean reference signal.
Contribution
The paper presents a multi-stage self-teaching based no-reference speech quality metric tailored to noise suppressors, addressing the poor correlation with human ratings of earlier no-reference approaches.
Findings
High correlation with human ratings in challenging conditions
Effective evaluation of noise suppressors without clean references
Generalizes well across diverse test scenarios
Abstract
Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches correlate poorly with human ratings and are not widely adopted in the research community. One of the biggest use cases of these perceptual objective metrics is to evaluate noise suppression algorithms. This paper introduces a multi-stage self-teaching based perceptual objective metric that is designed to evaluate noise suppressors. The proposed method generalizes well in challenging test conditions with a high correlation to human ratings.
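The claim of "high correlation to human ratings" can be made concrete with a small sketch of how an objective metric is typically compared against subjective scores. This is not the authors' evaluation code. The summary above does not say which correlation coefficient the paper reports, so Pearson (linear agreement) and Spearman (rank agreement) are assumed here as common choices. The function names and the example scores are hypothetical and purely illustrative.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-clip scores, made up for illustration only.
human_mos = [2.1, 3.4, 3.9, 2.8, 4.3]      # subjective ratings (assumed)
metric_scores = [2.4, 3.1, 4.0, 2.9, 4.1]  # objective predictions (assumed)

print("Pearson: ", round(pearson(human_mos, metric_scores), 3))
print("Spearman:", round(spearman(human_mos, metric_scores), 3))
```

Pearson rewards predictions that track the human scores linearly, while Spearman only checks that clips are ordered the same way. The second is often the more relevant question when the goal is to rank noise suppressors against each other.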
