Learning to Model Aspects of Hearing Perception Using Neural Loss Functions
Prateek Verma, Jonathan Berger

TL;DR
This paper introduces a neural loss function framework that models perceived audio quality, enabling enhancement of degraded musical sounds without parallel data, by combining classical signal processing with neural architectures.
Contribution
The paper presents a novel neural loss function approach that adapts classical signal processing techniques for audio quality enhancement without requiring parallel datasets.
Findings
Shallow neural architectures can effectively enhance audio quality.
Adaptive masks improve perceived acoustical quality.
A simple constraint on the objective and the initialization helps the method avoid adversarial examples.
Abstract
We present a framework to model the perceived quality of audio signals by combining convolutional architectures with ideas from classical signal processing, and describe an approach to enhancing perceived acoustical quality. We demonstrate the approach by transforming the sound of an inexpensive musical instrument with degraded sound quality to that of a high-quality musical instrument without the need for parallel data, which is often hard to collect. We adapt the classical approach of simple adaptive EQ filtering to the objective criterion learned by a neural architecture and optimize it to obtain the signal of interest. Since we learn adaptive masks depending on the signal of interest, as opposed to a fixed transformation for all inputs, we show that shallow neural architectures can achieve the desired result. A simple constraint on the objective and the initialization helps us in…
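The idea of optimizing a per-signal EQ mask against a learned objective can be illustrated with a toy sketch. This is not the authors' implementation: the one-hidden-layer scorer, its hand-picked weights, the four-band spectrum, and the clipping range are all hypothetical stand-ins, chosen only to show a mask being initialized at identity, kept inside a bounded range, and updated by gradient ascent on a fixed quality score.

```python
import math

def quality_score(y, W, b, v):
    """Hypothetical stand-in for a learned quality objective:
    a fixed one-hidden-layer scorer over band magnitudes y."""
    hidden = [math.tanh(sum(w * yk for w, yk in zip(row, y)) + bj)
              for row, bj in zip(W, b)]
    return sum(vj * hj for vj, hj in zip(v, hidden))

def score_gradient(y, W, b, v):
    """Analytic gradient of quality_score with respect to y."""
    grad = [0.0] * len(y)
    for row, bj, vj in zip(W, b, v):
        pre = sum(w * yk for w, yk in zip(row, y)) + bj
        dtanh = 1.0 - math.tanh(pre) ** 2
        for k, w in enumerate(row):
            grad[k] += vj * dtanh * w
    return grad

def optimize_mask(x, W, b, v, steps=200, lr=0.05, lo=0.5, hi=2.0):
    """Gradient ascent on a per-band gain mask for one input spectrum x.
    The identity initialization and the [lo, hi] clipping are assumed,
    illustrative constraints, not the ones used in the paper."""
    mask = [1.0] * len(x)  # start from "no change"
    for _ in range(steps):
        y = [m * xk for m, xk in zip(mask, x)]
        g_y = score_gradient(y, W, b, v)
        # Chain rule: d(score)/d(mask_k) = d(score)/d(y_k) * x_k
        mask = [min(hi, max(lo, m + lr * g * xk))
                for m, g, xk in zip(mask, g_y, x)]
    return mask

# Toy four-band magnitude spectrum and scorer weights (all hypothetical).
x = [1.0, 0.8, 0.3, 0.1]
W = [[-0.5, -0.2, 0.6, 0.9],
     [0.3, -0.4, 0.5, -0.1]]
b = [0.0, 0.1]
v = [1.0, 0.5]

before = quality_score(x, W, b, v)
mask = optimize_mask(x, W, b, v)
after = quality_score([m * xk for m, xk in zip(mask, x)], W, b, v)
print("mask:", [round(m, 3) for m in mask])
print("score improved:", after > before)
```

Because the mask is optimized separately for each input, the scorer itself stays fixed and small; only the gains change. Bounding the gains keeps the optimized spectrum close to the original, which is one simple way to limit degenerate solutions in a sketch like this.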
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Neural Networks and Applications
