Loud-loss: A Perceptually Motivated Loss Function for Speech Enhancement Based on Equal-Loudness Contours
Zixuan Li, Xueliang Zhang, Changjiang Zhao, Shuai Gao, Lei Miao, Zhipeng Yan, Ying Sun, Chong Zhu

TL;DR
This paper introduces a perceptually-weighted loss function for speech enhancement that uses equal-loudness contours to better align with human auditory perception, improving perceptual quality over traditional MSE.
Contribution
It proposes a novel psychoacoustically motivated loss function based on equal-loudness contours, enhancing speech enhancement models' perceptual performance.
Findings
Significant WB-PESQ score improvement from 2.17 to 2.93.
Loss function is model-agnostic and flexible.
Enhanced perceptual quality in speech enhancement.
Abstract
The mean squared error (MSE) is a ubiquitous loss function for speech enhancement, but its problem is that the error cannot reflect the auditory perception quality. This is because MSE causes models to over-emphasize low-frequency components which has high energy, leading to the inadequate modeling of perceptually important high-frequency information. To overcome this limitation, we propose a perceptually-weighted loss function grounded in psychoacoustic principles. Specifically, it leverages equal-loudness contours to assign frequency-dependent weights to the reconstruction error, thereby penalizing deviations in a way aligning with human auditory sensitivity. The proposed loss is model-agnostic and flexible, demonstrating strong generality. Experiments on the VoiceBank+DEMAND dataset show that replacing MSE with our loss in a GTCRN model elevates the WB-PESQ score from 2.17 to 2.93-a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpeech and Audio Processing · Hearing Loss and Rehabilitation · Advanced Adaptive Filtering Techniques
