The "Horse" Inside: Seeking Causes Behind the Behaviours of Music Content Analysis Systems
Bob L. Sturm

TL;DR
This paper critically examines a state-of-the-art music content analysis system to understand its true capabilities, limitations, and the nature of its learned knowledge, revealing potential misconceptions about its performance.
Contribution
It dissects a leading music analysis system to uncover what it actually learns and does, offering insights for developing more reliable music understanding tools.
Findings
High accuracy does not necessarily imply true understanding
The system's sensitivities and limitations are identified
Guidelines for improving music content analysis systems are proposed
Abstract
Building systems that possess the sensitivity and intelligence to identify and describe high-level attributes in music audio signals continues to be an elusive goal, but one that surely has broad and deep implications for a wide variety of applications. Hundreds of papers have so far been published toward this goal, and great progress appears to have been made. Some systems produce remarkable accuracies at recognising high-level semantic concepts, such as music style, genre and mood. However, it might be that these numbers do not mean what they seem. In this paper, we take a state-of-the-art music content analysis system and investigate what causes it to achieve exceptionally high performance in a benchmark music audio dataset. We dissect the system to understand its operation, determine its sensitivities and limitations, and predict the kinds of knowledge it could and could not possess…
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
