MMM : Exploring Conditional Multi-Track Music Generation with the Transformer
Jeff Ens, Philippe Pasquier

TL;DR
This paper introduces MMM, a Transformer-based system for multi-track music generation that uses separate track sequences and offers interactive control over musical features.
Contribution
It presents a novel multi-track music generation method using Transformers with track-specific sequences and interactive control features.
Findings
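MMM generates multi-track music by modeling each track as its own time-ordered event sequence.
An interactive demo supports track-level and bar-level inpainting, with control over track instrumentation and note density.
The Transformer's attention mechanism handles long-term dependencies across the concatenated track sequences.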
Abstract
We propose the Multi-Track Music Machine (MMM), a generative system based on the Transformer architecture that is capable of generating multi-track music. In contrast to previous work, which represents musical material as a single time-ordered sequence, where the musical events corresponding to different tracks are interleaved, we create a time-ordered sequence of musical events for each track and concatenate several tracks into a single sequence. This takes advantage of the Transformer's attention mechanism, which can adeptly handle long-term dependencies. We explore how various representations can offer the user a high degree of control at generation time, providing an interactive demo that accommodates track-level and bar-level inpainting, and offers control over track instrumentation and note density.
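The difference between the two representations described in the abstract can be illustrated with a toy example. The sketch below is not the paper's tokenizer: the token names (`TRACK_START[...]`, `TRACK_END`, `NOTE_<pitch>@<time>`), the helper functions, and the two-track data are all hypothetical, chosen only to show how an interleaved sequence differs from per-track sequences concatenated end to end.

```python
from typing import Dict, List, Tuple

# A note is (onset_time, pitch). Toy data only, not taken from the paper.
Note = Tuple[int, int]


def interleaved(tracks: Dict[str, List[Note]]) -> List[str]:
    """One time-ordered sequence in which events from all tracks are mixed."""
    events = [(t, name, p) for name, notes in tracks.items() for t, p in notes]
    events.sort()  # order by onset time, then track name, then pitch
    return [f"{name}:NOTE_{p}@{t}" for t, name, p in events]


def track_concatenated(tracks: Dict[str, List[Note]]) -> List[str]:
    """Each track is time-ordered on its own, then tracks are concatenated."""
    tokens: List[str] = []
    for name, notes in tracks.items():
        tokens.append(f"TRACK_START[{name}]")  # hypothetical delimiter token
        for t, p in sorted(notes):
            tokens.append(f"NOTE_{p}@{t}")
        tokens.append("TRACK_END")  # hypothetical delimiter token
    return tokens


toy = {"bass": [(0, 36), (2, 38)], "lead": [(0, 60), (1, 62)]}
interleaved_tokens = interleaved(toy)
concatenated_tokens = track_concatenated(toy)
```

In the concatenated form, every token of one track sits in a contiguous span, so a whole track can be located, removed, or regenerated by its delimiters. The cost is that simultaneous events in different tracks end up far apart in the sequence, which is why the abstract leans on attention's ability to handle long-term dependencies.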
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Generative Adversarial Networks and Image Synthesis
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Multi-Head Attention · Layer Normalization · Attention Is All You Need · Byte Pair Encoding · Dropout · Label Smoothing · Residual Connection
