Versatile Symbolic Music-for-Music Modeling via Function Alignment
Junyan Jiang, Daniel Chin, Liwei Lin, Xuanjie Liu, Gus Xia

TL;DR
This paper introduces a versatile, parameter-efficient approach for symbolic music modeling that unifies understanding and generation tasks using pretrained language models linked by lightweight adapters.
Contribution
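The paper proposes using a pretrained language model for both the reference sequence and the target sequence, with the two models linked by a lightweight adapter, so that understanding tasks and conditional generation tasks share one music-for-music sequence modeling framework.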
Findings
Superior performance in chord recognition, melody generation, and drum track generation
Effective unification of understanding and generation tasks
Parameter-efficient model design
Abstract
Many music AI models learn a map between music content and human-defined labels. However, many annotations, such as chords, can be naturally expressed within the music modality itself, e.g., as sequences of symbolic notes. This observation enables both understanding tasks (e.g., chord recognition) and conditional generation tasks (e.g., chord-conditioned melody generation) to be unified under a music-for-music sequence modeling paradigm. In this work, we propose parameter-efficient solutions for a variety of symbolic music-for-music tasks. The high-level idea is that (1) we utilize a pretrained Language Model (LM) for both the reference and the target sequence and (2) we link these two LMs via a lightweight adapter. Experiments show that our method achieves superior performance across different tasks such as chord recognition, melody generation, and drum track generation. All demos, code…
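To make the abstract's central observation concrete, the following minimal sketch shows one way a chord annotation could be written as a sequence of symbolic notes, so that the "label" lives in the same modality as the music. It is an illustration only, not the authors' encoding: the interval table, the `(onset, duration, pitch)` event format, and the function names `chord_to_notes` and `progression_to_sequence` are all assumptions made for this example.

```python
# Illustrative only: render chord labels as symbolic note events.
# The chord vocabulary, event format, and function names are assumptions
# for this sketch and are not taken from the paper.

NOTE_TO_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Semitone offsets above the root for a few common chord qualities.
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "7": (0, 4, 7, 10),
}


def chord_to_notes(root, quality, onset, duration, octave=4):
    """Return (onset, duration, midi_pitch) events for one chord."""
    pitch_class = NOTE_TO_PITCH_CLASS[root[0]]
    for accidental in root[1:]:
        pitch_class += 1 if accidental == "#" else -1  # '#' raises, 'b' lowers
    base = 12 * (octave + 1) + pitch_class  # MIDI convention: C4 = 60
    return [(onset, duration, base + step) for step in CHORD_INTERVALS[quality]]


def progression_to_sequence(chords, beats_per_chord=4):
    """Flatten a chord progression into one note-event sequence."""
    events = []
    for index, (root, quality) in enumerate(chords):
        onset = index * beats_per_chord
        events.extend(chord_to_notes(root, quality, onset, beats_per_chord))
    return events


sequence = progression_to_sequence(
    [("C", "maj"), ("A", "min"), ("F", "maj"), ("G", "7")]
)
```

Under this hypothetical encoding, a C major chord becomes the MIDI pitches 60, 64, and 67. Chord recognition then amounts to mapping a music sequence to a note sequence of this kind, and chord-conditioned melody generation runs the mapping in the other direction.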
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
