When Noise Lowers The Loss: Rethinking Likelihood-Based Evaluation in Music Large Language Models

Xiaosha Li; Chun Liu; Ziyu Wang

arXiv:2602.02738·cs.SD·February 4, 2026

Open Access

TL;DR

This paper shows that in music large language models, cross-entropy loss can decrease when the input music is corrupted, which challenges its use as a quality metric. It proposes an evaluation method based on the shape of the loss curve to better assess musical quality.

Contribution

The paper introduces a noise injection method for analyzing model responses, demonstrates that the shape of the loss curve reflects musical quality, and proposes a profile-based evaluation framework.

Findings

01. Models respond more to local disruptions than global corruption.

02. Loss curve shape encodes information about musical quality.

03. Proposed evaluation is label-free and model-intrinsic.

Abstract

The rise of music large language models (LLMs) demands robust methods of evaluating output quality, especially in distinguishing high-quality compositions from "garbage music". Curiously, we observe that the standard cross-entropy loss, a core training metric, often decreases when models encounter systematically corrupted music, undermining its validity as a standalone quality indicator. To investigate this paradox, we introduce a noise injection experiment, in which controlled noise signals of varying lengths are injected into musical contexts. We hypothesize that a positive reaction of a model's loss to these perturbations, specifically a sharp increase (the "Peak" area) for short injections, can serve as a proxy for its ability to discern musical integrity. Experiments with MusicGen models in the audio waveform domain confirm that Music LLMs respond more strongly to local, texture-level…
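The noise injection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper works with MusicGen, whereas the code below assumes a toy per-step loss (a predictor that guesses each sample equals the previous one) in place of a real model. The function names `inject_noise`, `toy_per_step_loss`, and `peak_area` are hypothetical, and defining the "Peak" area as the summed positive excess of the noisy loss over the clean loss is an assumption made for illustration only.

```python
import numpy as np


def inject_noise(signal, start, length, rng, scale=1.0):
    """Return a copy of `signal` with [start, start + length) replaced by Gaussian noise."""
    out = signal.copy()
    end = min(start + length, len(out))
    out[start:end] = rng.normal(0.0, scale, size=end - start)
    return out


def toy_per_step_loss(signal):
    """Toy stand-in for a model's per-token loss (assumption, not MusicGen).

    Uses the Gaussian negative log-likelihood, up to a constant, of a
    predictor that guesses each sample equals the previous sample.
    """
    pred = np.concatenate(([signal[0]], signal[:-1]))
    return 0.5 * (signal - pred) ** 2


def peak_area(clean_loss, noisy_loss, start, window):
    """Sum the positive excess loss (noisy minus clean) over [start, start + window).

    This definition of the "Peak" area is an illustrative assumption.
    """
    end = min(start + window, len(clean_loss))
    excess = noisy_loss[start:end] - clean_loss[start:end]
    return float(np.clip(excess, 0.0, None).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 1000, endpoint=False)
    clean = np.sin(2 * np.pi * 5 * t)  # smooth "musical" context
    clean_loss = toy_per_step_loss(clean)

    # Compare a short injection against a longer one at the same position.
    for length in (20, 200):
        noisy = inject_noise(clean, start=400, length=length, rng=rng)
        area = peak_area(clean_loss, toy_per_step_loss(noisy), start=400, window=length + 10)
        print(f"injection length {length}: peak area {area:.2f}")
```

With a real model, `toy_per_step_loss` would be replaced by the per-token cross-entropy the model assigns to the clean and perturbed sequences; the comparison of the two loss curves around the injection point stays the same.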

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Generative Adversarial Networks and Image Synthesis