A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

Pranay Manocha; Adam Finkelstein; Richard Zhang; Nicholas J. Bryan; Gautham J. Mysore; Zeyu Jin

arXiv:2001.04460 · eess.AS · May 19, 2020 · 5 cites

TL;DR

This paper introduces a deep neural network-based perceptual audio metric trained on crowdsourced human judgments to accurately reflect human perception of audio differences, especially near the just-noticeable difference threshold.

Contribution

It presents a novel differentiable perceptual audio metric learned from a large dataset of human judgments, improving correlation with human perception over existing metrics.

Findings

01 The learned metric outperforms baseline methods in correlating with human judgments.

02 Replacing traditional loss functions with this metric improves audio denoising results.

03 The metric is effective as a differentiable loss function for audio processing tasks.
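To illustrate what "differentiable loss function" means in the findings above, here is a minimal sketch in Python with NumPy. It is not the paper's network: the weight matrix `W`, the `features` function, and the step size are all hypothetical stand-ins, with random weights playing the role of a trained model. The point is only that a distance computed in a feature space has a gradient with respect to the input signal, so an optimizer can reduce it directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network: one fixed random layer.
# In the paper the weights would come from fitting human judgments.
W = rng.standard_normal((8, 16)) / 4.0

def features(x):
    """Map a 16-sample signal to 8 bounded feature activations."""
    return np.tanh(W @ x)

def metric(x, ref):
    """Mean squared distance between the features of x and of ref."""
    d = features(x) - features(ref)
    return float(np.mean(d ** 2))

def metric_grad(x, ref):
    """Analytic gradient of metric(x, ref) with respect to x (chain rule)."""
    f = features(x)
    d = f - features(ref)
    return W.T @ (2.0 / d.size * d * (1.0 - f ** 2))

# A "clean" reference and a perturbed copy of it.
ref = rng.standard_normal(16)
x = ref + 0.5 * rng.standard_normal(16)

start = metric(x, ref)
for _ in range(500):
    x = x - 0.2 * metric_grad(x, ref)  # plain gradient descent on the metric
end = metric(x, ref)

print(f"metric before: {start:.6f}, after: {end:.6f}")
```

In a denoising setup the gradient would flow into the parameters of a denoising model rather than into the signal itself, but the mechanism is the same.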

Abstract

Many audio processing tasks require perceptual assessment. The "gold standard" of obtaining human judgments is time-consuming, expensive, and cannot be used as an optimization criterion. On the other hand, automated metrics are efficient to compute but often correlate poorly with human judgment, particularly for audio differences at the threshold of human detection. In this work, we construct a metric by fitting a deep neural network to a new large dataset of crowdsourced human judgments. Subjects are prompted to answer a straightforward, objective question: are two recordings identical or not? These pairs are algorithmically generated under a variety of perturbations, including noise, reverb, and compression artifacts; the perturbation space is probed with the goal of efficiently identifying the just-noticeable difference (JND) level of the subject. We show that the resulting learned…
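The abstract does not say how the perturbation space is probed, so the following Python sketch is only one plausible illustration: a bisection search over a single perturbation strength, using a simulated listener in place of a human subject. The names `simulated_listener` and `estimate_jnd`, the 0 to 1 strength scale, and the true JND of 0.3 are all assumptions made for the example.

```python
def simulated_listener(level, true_jnd):
    """Hypothetical noise-free subject: reports "different" (True)
    whenever the perturbation strength exceeds their JND."""
    return level > true_jnd

def estimate_jnd(answer_fn, lo=0.0, hi=1.0, n_trials=12):
    """Bisect the perturbation strength using same/different answers.

    answer_fn(level) returns True if the subject hears a difference
    between the reference and the recording perturbed at `level`.
    """
    for _ in range(n_trials):
        mid = 0.5 * (lo + hi)
        if answer_fn(mid):
            hi = mid  # difference heard: the JND is at or below mid
        else:
            lo = mid  # no difference heard: the JND is above mid
    return 0.5 * (lo + hi)

estimate = estimate_jnd(lambda level: simulated_listener(level, true_jnd=0.3))
print(f"estimated JND: {estimate:.4f}")
```

Each trial halves the interval that can contain the threshold, which is why a dozen same/different questions are enough to localize it here. Real listeners answer inconsistently near threshold, so a practical procedure would need to tolerate noisy responses.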

Code & Models

Repositories

pranaymanocha/PerceptualAudio (tf, Official)

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Hearing Loss and Rehabilitation