# The First Cadenza Challenge: Perceptual Evaluation of Machine Learning Systems to Improve Audio Quality of Popular Music for Those with Hearing Loss

**Authors:** Scott Bannister, Jennifer Firth, Gerardo Roa-Dabike, Rebecca Vos, William Whitmer, Alinka E. Greasley, Simone Graetzer, Bruno Fazenda, Trevor Cox, Jon Barker, Michael A. Akeroyd

PMC · DOI: 10.1177/23312165251408761 · 2026-01-30

## TL;DR

In this study, hearing aid users rated machine learning systems designed to improve music audio quality for people with hearing loss; no entrant system outperformed the baseline.

## Contribution

The first perceptual evaluation of machine learning systems for music enhancement in hearing aid users across varying hearing loss severities.

## Key findings

- No submitted system outperformed the baseline HDemucs-based audio enhancement method.
- Clarity and distortion ratings were most predictive of overall audio quality for hearing aid users.
- Systems whose output showed higher objective loudness, spectral flux, and clipping as hearing loss severity increased received lower audio quality ratings from listeners with moderately severe hearing loss.

## Abstract

Music is central to many people's lives, and hearing loss (HL) is often a barrier to musical engagement. Hearing aids (HAs) help, but their efficacy in improving speech does not consistently translate to music. This research evaluated systems submitted to the 1st Cadenza Machine Learning Challenge, where entrants aimed to improve music audio quality for HA users through source separation and remixing. The HA users (N = 53, ranging from “mild” to “moderately severe” HL) assessed eight challenge systems (including one baseline using the HDemucs source separation algorithm, remixing to original mixes of music samples, and applying National Acoustic Laboratories Revised amplification) and rated 200 music samples processed for their HL. Participants rated samples on basic audio quality, clarity, harshness, distortion, frequency balance, and liking. Results suggest no entrant system surpassed the baseline for audio quality, although differences emerged in system efficacy across HL severities. Clarity and distortion ratings were most predictive of audio quality. Finally, some systems produced signals with higher objective loudness, spectral flux and clipping with increasing HL severity; these received lower audio quality ratings by listeners with moderately severe HL. Findings highlight how music enhancement requires varied solutions and tests across a range of HL severities. This challenge provided a first application of source separation to music listening with HL. However, state-of-the-art source separation algorithms limited the diversity of entrant solutions, resulting in no improvements over the baseline; to promote development of innovative processing strategies, future work should increase complexity of music listening scenarios to be addressed through source separation.
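The abstract refers to objective loudness, spectral flux, and clipping without stating how they were computed. The following is a minimal sketch of how such descriptors could be measured on a processed signal, not the authors' method. The function names, the clipping threshold, the frame sizes, and the use of RMS level as a simple stand-in for loudness are all assumptions made for illustration.

```python
import numpy as np


def clipping_fraction(x, threshold=0.999):
    """Share of samples whose magnitude reaches an assumed full-scale threshold."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(np.abs(x) >= threshold))


def rms_level_db(x, eps=1e-12):
    """RMS level in dB re full scale; a crude proxy for loudness (assumption)."""
    x = np.asarray(x, dtype=float)
    return float(20.0 * np.log10(np.sqrt(np.mean(x ** 2)) + eps))


def spectral_flux(x, frame_len=1024, hop=512):
    """Mean Euclidean distance between magnitude spectra of consecutive frames."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop if len(x) >= frame_len else 0
    if n_frames < 2:
        return 0.0  # not enough frames to form a difference
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    mags = np.abs(np.fft.rfft(frames, axis=1))
    diffs = np.diff(mags, axis=0)
    return float(np.mean(np.sqrt(np.sum(diffs ** 2, axis=1))))


# Illustrative signals: a 440 Hz tone, and the same tone over-amplified
# and hard-limited to [-1, 1], which is one way heavy gain can cause clipping.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
clipped = np.clip(4.0 * clean, -1.0, 1.0)

for name, signal in [("clean", clean), ("clipped", clipped)]:
    print(
        name,
        round(clipping_fraction(signal), 3),
        round(rms_level_db(signal), 2),
        round(spectral_flux(signal), 4),
    )
```

Comparing these values for the same excerpt processed for different hearing loss severities would show whether a system's output becomes louder or more clipped as the prescribed gain increases. A perceptual loudness model would be a closer match to "objective loudness" than RMS level.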

## Linked entities

- **Diseases:** hearing loss (MONDO:0005365)

## Full-text entities

- **Diseases:** fatigue (MESH:D005221), Meniere's disease (MESH:D008575), tinnitus (MESH:D014012), hyperacusis (MESH:D012001)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

16 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12858752/full.md

---
Source: https://tomesphere.com/paper/PMC12858752