ARIA: A Diagnostic Framework for Music Training Data Attribution

Changheon Han; Ashkan Panahi; Kıvanç Tatar

arXiv:2605.16181 · cs.SD · May 18, 2026

TL;DR

ARIA is a framework for music training data attribution that decomposes influence along musical aspects (five for symbolic music, three for audio) and adds diagnostics for the reliability of attribution methods, showing not only which training songs influence a generated output but also along which aspects that influence operates.

Contribution

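ARIA introduces a decomposition of attribution along musical aspects and pairs it with reliability diagnostics computed from the segment-level score matrix, targeting the two questions that music copyright analysis requires: which training songs influence a generated output, and along which musical aspects.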

Findings

01. Reliability diagnostics rank four attribution methods accurately against ground truth.

02. ARIA reveals significant variation in attribution behaviors across methods.

03. ARIA characterizes embedding-similarity retrieval baselines by musical aspect.

Abstract

Training data attribution (TDA) for music generation must answer two questions that copyright analysis requires, namely which training songs influence a generated output and along which musical aspects the influence operates. Existing methods reduce influence to a single scalar, without revealing which musical aspects are dominant in that influence. We propose ARIA, a framework that decomposes attribution along musical aspects (five for symbolic music, three for audio) and pairs the decomposition with reliability diagnostics computed from the segment-level score matrix. It measures within-group similarity among the top-K attributed tracks against random reference groups drawn from the training pool, and diagnoses the score matrix through its singular value decomposition and column statistics. On a symbolic-music model where attribution ground truth is available through counterfactual…
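The two diagnostics described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it only shows one plausible reading of "within-group similarity of the top-K attributed tracks against random reference groups" and "SVD and column statistics of the score matrix". The function names, the use of cosine similarity, the per-track feature matrix, the empirical p-value, and the entropy-based effective rank are all assumptions introduced for illustration.

```python
import numpy as np

def within_group_similarity(features, idx):
    """Mean pairwise cosine similarity among the rows of `features` in `idx`.

    Assumes every selected feature vector is nonzero.
    """
    x = features[idx]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    n = len(idx)
    # Average over off-diagonal entries only (exclude self-similarity).
    return (sim.sum() - np.trace(sim)) / (n * (n - 1))

def topk_vs_random(scores, features, k=5, n_ref=200, seed=0):
    """Compare the top-k attributed tracks with random groups of the same size.

    scores:   (n_train,) attribution score per training track (hypothetical input)
    features: (n_train, d) feature vectors for one musical aspect (hypothetical input)
    """
    rng = np.random.default_rng(seed)
    top_idx = np.argsort(scores)[-k:]
    top_sim = within_group_similarity(features, top_idx)
    ref_sims = np.array([
        within_group_similarity(
            features, rng.choice(len(scores), size=k, replace=False)
        )
        for _ in range(n_ref)
    ])
    # Empirical p-value: share of random groups at least as cohesive as the top-k group.
    p_value = (np.sum(ref_sims >= top_sim) + 1) / (n_ref + 1)
    return top_sim, ref_sims.mean(), p_value

def score_matrix_diagnostics(score_matrix):
    """SVD and column statistics of an (n_segments, n_train) score matrix."""
    s = np.linalg.svd(score_matrix, compute_uv=False)
    energy = s**2 / np.sum(s**2)
    return {
        # Share of squared spectral mass in the leading singular value.
        "top1_energy": float(energy[0]),
        # Exponential of the entropy of the normalized squared singular values
        # (an assumed definition of effective rank).
        "effective_rank": float(np.exp(-np.sum(energy * np.log(energy + 1e-12)))),
        # Spread of per-track mean scores across training tracks.
        "col_mean_std": float(score_matrix.mean(axis=0).std()),
        # Average per-track variation across generated segments.
        "col_std_mean": float(score_matrix.std(axis=0).mean()),
    }
```

Under these assumptions, a top-K group whose similarity is far above the random-reference mean suggests the attribution is picking up a coherent musical aspect, while a score matrix with `top1_energy` close to 1 is nearly rank one, meaning it assigns almost the same ranking of training tracks to every generated segment.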
