Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization
Longshen Ou, Jingwei Zhao, Ziyu Wang, Gus Xia, Qihao Liang, Torin Hopkins, Ye Wang

TL;DR
This paper introduces a unified, pre-trained symbolic music model capable of handling diverse multitrack arrangement tasks through a novel tokenization scheme and a segment-level reconstruction objective, outperforming specialized models.
Contribution
The authors propose REMI-z, a structured tokenization scheme, and a segment-level reconstruction approach enabling a single model to perform various arrangement tasks effectively.
Findings
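Outperforms task-specific state-of-the-art models on band arrangement, piano reduction, and drum arrangement, in both objective metrics and perceptual evaluations
REMI-z improves modeling efficiency and effectiveness for both arrangement tasks and unconditional generation
A single pre-trained model supports flexible any-to-any instrumentation transformations at inference time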
Abstract
We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. At its core is a segment-level reconstruction objective operating on token-level disentangled content and style, allowing for flexible any-to-any instrumentation transformations at inference time. To support track-wise modeling, we introduce REMI-z, a structured tokenization scheme for multitrack symbolic music that enhances modeling efficiency and effectiveness for both arrangement tasks and unconditional generation. Our method outperforms task-specific state-of-the-art models on representative tasks in different arrangement scenarios -- band arrangement, piano reduction, and drum arrangement, in both objective metrics and perceptual evaluations.…
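The abstract does not spell out the REMI-z token layout or the exact form of the reconstruction objective, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that "track-wise" means each bar's notes are grouped by instrument before being serialized, and that the reconstruction condition consists of an instrument list (style) plus an instrument-free note stream (content). The `Note` type, the token spellings (`i-`, `o-`, `p-`, `d-`, `b-1`), and both function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Note(NamedTuple):
    instrument: int  # MIDI program number (hypothetical field)
    onset: int       # position within the bar, in grid steps
    pitch: int       # MIDI pitch
    duration: int    # length in grid steps


def tokenize_bar_trackwise(notes):
    """Group one bar's notes by instrument and emit one token run per track."""
    tracks = defaultdict(list)
    for n in notes:
        tracks[n.instrument].append(n)
    tokens = []
    for inst in sorted(tracks):
        tokens.append(f"i-{inst}")  # track header (assumed token spelling)
        # Within a track: earlier onsets first, higher pitches first on ties.
        for n in sorted(tracks[inst], key=lambda x: (x.onset, -x.pitch)):
            tokens += [f"o-{n.onset}", f"p-{n.pitch}", f"d-{n.duration}"]
    tokens.append("b-1")  # hypothetical end-of-bar marker
    return tokens


def make_reconstruction_pair(notes):
    """Build a (condition, target) pair for a reconstruction-style objective.

    Assumption: style = which instruments are used; content = the notes with
    instrument identity stripped. The target is the original track-wise bar.
    """
    style = [f"i-{i}" for i in sorted({n.instrument for n in notes})]
    flat = sorted({(n.onset, n.pitch, n.duration) for n in notes},
                  key=lambda t: (t[0], -t[1]))
    content = []
    for onset, pitch, duration in flat:
        content += [f"o-{onset}", f"p-{pitch}", f"d-{duration}"]
    return style + content, tokenize_bar_trackwise(notes)


bar = [Note(0, 0, 60, 4), Note(33, 0, 36, 8), Note(0, 4, 64, 4)]
condition, target = make_reconstruction_pair(bar)
```

Under these assumptions, replacing the style tokens in `condition` at inference time (for example, asking for a different instrument set while keeping the same content) would be one way to request a new instrumentation of the same material.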
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech Recognition and Synthesis
