TL;DR
This paper presents a two-stage training approach that leverages large-scale general music pre-training and small-scale fine-tuning with style indicators to improve composer-style symbolic music generation, especially for data-scarce styles.
Contribution
It introduces a novel two-stage training paradigm combining broad pre-training and style-specific fine-tuning with adapters for composer-style music generation.
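To make the adapter idea concrete, here is a minimal pure-Python sketch of a style-conditioned bottleneck adapter. It is an illustration under stated assumptions, not the paper's implementation: the class name `StyleAdapter`, the dimensions, the additive style embedding, and the zero-initialised up-projection are all hypothetical choices; the summary above only states that a lightweight adapter conditions the pre-trained model on style indicators.

```python
import random


def make_matrix(rows, cols, scale, rng):
    """Build a rows x cols matrix with small random values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class StyleAdapter:
    """Hypothetical bottleneck adapter conditioned on a composer style indicator.

    Assumption: the pre-trained model produces a hidden vector, and the adapter
    adds a small style-dependent residual on top of it.
    """

    def __init__(self, hidden_dim, bottleneck_dim, styles, seed=0):
        rng = random.Random(seed)
        # One learnable embedding per style indicator (assumed design).
        self.style_embedding = {
            style: [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
            for style in styles
        }
        self.down = make_matrix(bottleneck_dim, hidden_dim, 0.1, rng)
        # Zero-initialised up-projection: before any fine-tuning the adapter
        # leaves the pre-trained hidden state unchanged.
        self.up = [[0.0] * bottleneck_dim for _ in range(hidden_dim)]

    def __call__(self, hidden, style):
        # Inject the style indicator, project down, apply ReLU, project up.
        conditioned = [h + e for h, e in zip(hidden, self.style_embedding[style])]
        bottleneck = [max(0.0, v) for v in matvec(self.down, conditioned)]
        delta = matvec(self.up, bottleneck)
        # Residual connection around the adapter.
        return [h + d for h, d in zip(hidden, delta)]


adapter = StyleAdapter(hidden_dim=8, bottleneck_dim=2,
                       styles=["Bach", "Mozart", "Beethoven", "Chopin"])
hidden_state = [0.5] * 8
adapted = adapter(hidden_state, "Chopin")
```

In a setup like this, only the adapter's parameters (`style_embedding`, `down`, `up`) would be updated during the second stage, which is one plausible way to keep fine-tuning lightweight when only a few pieces per composer are available.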
Findings
Outperforms baselines in style accuracy and musicality
Enhances the model's ability to capture specific composer styles
Demonstrates effective transfer of general music knowledge to style mastery
Abstract
Despite progress in controllable symbolic music generation, data scarcity remains a challenge for certain control modalities. Composer-style music generation is a prime example, as only a few pieces per composer are available, limiting the modeling of both styles and fundamental music elements (e.g., melody, chord, rhythm). In this paper, we investigate how general music knowledge learned from a broad corpus can enhance the mastery of specific composer styles, with a focus on piano piece generation. Our approach follows a two-stage training paradigm. First, we pre-train a REMI-based music generation model on a large corpus of pop, folk, and classical music. Then, we fine-tune it on a small, human-verified dataset from four renowned composers, namely Bach, Mozart, Beethoven, and Chopin, using a lightweight adapter module to condition the model on style indicators. To evaluate the…
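For readers unfamiliar with REMI, the sketch below shows roughly what a REMI-style event sequence looks like: notes are serialised as bar, position, velocity, pitch, and duration tokens on a fixed grid. This is a simplified illustration, not the paper's tokenizer: the function name `to_remi_like_tokens`, the 16-step grid, the token spellings, and the omission of tempo and chord events are all assumptions made for brevity.

```python
def to_remi_like_tokens(notes, steps_per_bar=16):
    """Convert notes to a simplified REMI-like token list.

    notes: iterable of (onset_step, pitch, duration_steps, velocity) tuples,
    with onsets and durations measured in grid steps (assumed 16 per bar).
    """
    tokens = []
    current_bar = -1
    for onset, pitch, duration, velocity in sorted(notes):
        bar, position = divmod(onset, steps_per_bar)
        # Emit one "Bar" token for each new bar, including empty ones.
        while current_bar < bar:
            tokens.append("Bar")
            current_bar += 1
        tokens.append(f"Position_{position + 1}/{steps_per_bar}")
        tokens.append(f"Velocity_{velocity}")
        tokens.append(f"Pitch_{pitch}")
        tokens.append(f"Duration_{duration}")
    return tokens


# Three notes: two in the first bar, one at the start of the second bar.
example_notes = [(0, 60, 4, 64), (4, 64, 4, 64), (16, 67, 8, 70)]
example_tokens = to_remi_like_tokens(example_notes)
```

A sequence model pre-trained on such tokens learns bar and beat structure explicitly, which is the kind of general music knowledge the fine-tuning stage is meant to build on.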
Taxonomy
Methods: Adapter · Focus
