Towards Improved Objective Perceptual Audio Quality Assessment -- Part 1: A Novel Data-Driven Cognitive Model

Pablo M. Delgado; Jürgen Herre

arXiv:2411.18222·eess.AS·November 28, 2024

TL;DR

This paper introduces a novel machine learning approach that models cognitive aspects of audio quality perception to improve the generalization of objective audio quality assessment tools, especially for unseen signals and distortions.

Contribution

It proposes a new adaptive weighting method for distortion metrics based on subjective data, enhancing prediction accuracy and generalization in audio quality assessment.
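As a rough illustration of what weighting distortion metrics from subjective data can mean in practice, the sketch below fits one weight per metric, plus a bias, to listening-test scores by ridge-regularized least squares. It is not the paper's cognitive model and makes no claim about how the authors' adaptive weighting works; the function names `fit_metric_weights` and `predict_quality`, the `ridge` parameter, and the linear form are all assumptions made for illustration.

```python
import numpy as np


def fit_metric_weights(metrics, scores, ridge=1e-3):
    """Fit one weight per distortion metric plus a bias term.

    metrics: array of shape (n_items, n_metrics), one row per audio item.
    scores:  array of shape (n_items,), subjective quality scores.
    ridge:   hypothetical regularization strength (not from the paper).
    """
    X = np.asarray(metrics, dtype=float)
    y = np.asarray(scores, dtype=float)
    # Append a column of ones so the last coefficient acts as the bias.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = ridge * np.eye(A.shape[1])
    reg[-1, -1] = 0.0  # leave the bias unpenalized
    coeffs = np.linalg.solve(A.T @ A + reg, A.T @ y)
    return coeffs[:-1], coeffs[-1]


def predict_quality(metrics, weights, bias):
    """Map distortion metrics to a predicted quality score (linear sketch)."""
    return np.asarray(metrics, dtype=float) @ weights + bias
```

In this simplified form the weights are fixed once fitted; an adaptive scheme would let them depend on the signal or on the distortions present, which this sketch does not attempt.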

Findings

01 Achieves higher prediction accuracy on unseen data.

02 Models cognitive effects of distortions.

03 Offers a manageable alternative to complex machine learning models.

Abstract

Efficient audio quality assessment is vital for streamlining audio codec development. Objective assessment tools have been developed over time to algorithmically predict quality ratings from subjective assessments, the gold standard for quality judgment. Many of these tools use perceptual auditory models to extract audio features that are mapped to a basic audio quality score prediction using machine learning algorithms and subjective scores as training data. However, existing tools struggle with generalization in quality prediction, especially when faced with unknown signal and distortion types. This is particularly evident in the presence of signals coded using non-waveform-preserving parametric techniques. Addressing these challenges, this two-part work proposes extensions to the Perceptual Evaluation of Audio Quality (PEAQ - ITU-R BS.1387-1) recommendation. Part 1 focuses on…
