Preference-Based Learning in Audio Applications: A Systematic Analysis

Aaron Broukhim; Yiran Shen; Prithviraj Ammanabrolu; Nadir Weibel

arXiv:2511.13936 · cs.SD · November 19, 2025

Open Access

TL;DR

This systematic review highlights the emerging role of preference learning in audio applications, emphasizing recent shifts towards generation tasks and the need for standardized benchmarks and datasets.

Contribution

It provides a comprehensive analysis of the sparse application of preference learning in audio, identifying key patterns and future research directions.

Findings

01

Preference learning is underutilized in audio: only 30 of approximately 500 reviewed papers (6%) apply it.

02

Post-2021 studies focus on generation tasks using RLHF frameworks.

03

Multi-stage training pipelines and multi-dimensional evaluation strategies are emerging.

Abstract

Despite the parallel challenges that audio and text domains face in evaluating generative model outputs, preference learning remains remarkably underexplored in audio applications. Through a PRISMA-guided systematic review of approximately 500 papers, we find that only 30 (6%) apply preference learning to audio tasks. Our analysis reveals a field in transition: pre-2021 works focused on emotion recognition using traditional ranking methods (rankSVM), while post-2021 studies have pivoted toward generation tasks employing modern RLHF frameworks. We identify three critical patterns: (1) the emergence of multi-dimensional evaluation strategies combining synthetic, automated, and human preferences; (2) inconsistent alignment between traditional metrics (WER, PESQ) and human judgments across different contexts; and (3) convergence on multi-stage training pipelines that combine reward signals.…
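As an illustration of the pairwise preference signal that both ranking methods and RLHF-style reward models build on, the following is a minimal sketch of a Bradley-Terry style loss in plain Python. It is not taken from the paper or from any of the reviewed works. The function names (`bradley_terry_loss`, `mean_preference_loss`) and the scalar scores are hypothetical, standing in for whatever a model would assign to a preferred and a rejected audio clip.

```python
import math


def bradley_terry_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood that the preferred clip outranks the rejected one.

    Under a Bradley-Terry model, P(preferred > rejected) = sigmoid(margin),
    where margin = score_preferred - score_rejected. Both scores are
    hypothetical model outputs, not quantities defined by the paper.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(margin)) = log(1 + exp(-margin)), computed in a
    # numerically stable way for both signs of the margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (preferred_score, rejected_score) tuples."""
    return sum(bradley_terry_loss(p, r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Illustrative scores only: three human-labeled preference pairs.
    example_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
    print(mean_preference_loss(example_pairs))
```

The loss shrinks as the preferred clip's score rises above the rejected clip's score, and it grows roughly linearly when the ordering is violated. That behaviour is what lets a scoring model be trained from pairwise human judgments rather than absolute ratings.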

Taxonomy

Topics: Music and Audio Processing · Emotion and Mood Recognition · Neuroscience and Music Perception