Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music
Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, Julian McAuley

TL;DR
This paper explores automatic instrument assignment in symbolic multitrack music by learning to separate parts from mixtures, enabling real-time and offline applications like assistive composing and instrument switching.
Contribution
It introduces a machine learning approach to separate musical parts from mixtures, framing it as a sequential classification problem, and demonstrates its effectiveness across diverse datasets.
Findings
Proposed models outperform baselines in part separation.
Models can generate convincing alternative instrumentations.
Effective across multiple genres and ensemble types.
Abstract
Modern keyboards allow a musician to play multiple instruments at the same time by assigning zones -- fixed pitch ranges of the keyboard -- to different instruments. In this paper, we aim to further extend this idea and examine the feasibility of automatic instrumentation -- dynamically assigning instruments to notes in solo music during performance. In addition to the online, real-time-capable setting for performative use cases, automatic instrumentation can also find applications in assistive composing tools in an offline setting. Due to the lack of paired data of original solo music and their full arrangements, we approach automatic instrumentation by learning to separate parts (e.g., voices, instruments and tracks) from their mixture in symbolic multitrack music, assuming that the mixture is to be played on a keyboard. We frame the task of part separation as a sequential multi-class…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMusic and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
