MuSFA: Improving Music Structural Function Analysis with Partially Labeled Data
Ju-Chiang Wang, Jordan B. L. Smith, Yun-Ning Hung

TL;DR
This paper improves music structure analysis by using a large, partially labeled dataset, the HookTheory Lead Sheet Dataset (HLSD), to raise boundary detection and section labeling accuracy in a direct prediction framework.
Contribution
It repurposes the HLSD dataset for music structure analysis, demonstrating improved performance by combining it with existing datasets in a partially labeled training approach.
Findings
Improved boundary detection scores by ~3%.
Enhanced section labeling accuracy by ~1%.
Effective use of partially labeled data for music analysis.
Abstract
Music structure analysis (MSA) systems aim to segment a song recording into non-overlapping sections with useful labels. Previous MSA systems typically predict abstract labels in a post-processing step and require the full context of the song. By contrast, we recently proposed a supervised framework, called "Music Structural Function Analysis" (MuSFA), that models and predicts meaningful labels like 'verse' and 'chorus' directly from audio, without requiring the full context of a song. However, the performance of this system depends on the amount and quality of training data. In this paper, we propose to repurpose a public dataset, HookTheory Lead Sheet Dataset (HLSD), to improve the performance. HLSD contains over 18K excerpts of music sections originally collected for studying automatic melody harmonization. We treat each excerpt as a partially labeled song and provide a label…
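One common way to train on partially labeled audio is to compute the loss only over frames that carry a label and to ignore the rest. The sketch below illustrates that general idea with a masked cross-entropy in NumPy. It is not taken from the paper: the abstract above is truncated and does not say how the excerpt labels enter the training objective, so the function name `masked_cross_entropy`, the `UNLABELED` marker, and the frame-level layout are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical marker for frames that have no section annotation.
UNLABELED = -1


def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over labeled frames only.

    logits: (num_frames, num_classes) array of unnormalized class scores.
    labels: (num_frames,) integer array; UNLABELED marks frames to skip.
    Returns 0.0 when no frame is labeled.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)

    # Indices of the frames that actually carry a label.
    idx = np.flatnonzero(labels != UNLABELED)
    if idx.size == 0:
        return 0.0

    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Pick the log-probability of the true class for each labeled frame.
    picked = log_probs[idx, labels[idx]]
    return float(-picked.mean())


# Example: 4 frames, 3 hypothetical classes; only frames 0 and 2 are labeled.
example_logits = np.zeros((4, 3))
example_labels = np.array([0, UNLABELED, 2, UNLABELED])
example_loss = masked_cross_entropy(example_logits, example_labels)
```

With this kind of masking, predictions on unlabeled frames contribute nothing to the loss, so a fully annotated song and a single labeled excerpt can share the same training loop.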
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
