Musical Attention Transformer: Music Generation Using a Music-Specific Attention Model
Shinnosuke Taksuka, Hideo Mukai

TL;DR
This paper introduces Musical Attention, a novel attention mechanism for Transformers that incorporates musical metadata to improve the coherence, diversity, and naturalness of generated music.
Contribution
The study proposes a music-specific attention model that explicitly leverages musical structure and metadata, enhancing Transformer-based music generation quality.
Findings
Outperforms prior methods like Full and Strided Attention in coherence and diversity.
Reduces note repetition and improves harmonic consistency.
Generates more natural and expressive musical compositions.
Abstract
This study aims to enhance the quality of music generation using Transformers by incorporating meta-information. While Transformer-based approaches are effective at capturing long-term dependencies in musical compositions, the music they generate often suffers from issues such as excessive repetition or duplication of notes, leading to unnatural melodies. To address these limitations, we propose Musical Attention, a mechanism that incorporates meta-information such as bar numbers, key signatures, and tempos into the attention process. Musical Attention explicitly leverages both the structural properties of music and its associated metadata, enabling the Transformer's attention mechanism to operate more effectively and thereby improving the quality of the generated output. In our framework, each musical note is represented as a combination of five events: pitch, bar number, onset,…
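To make the idea of feeding metadata into attention concrete, the following is a minimal sketch, not the paper's actual Musical Attention formulation (which the abstract does not specify). It assumes one plausible design: causal scaled dot-product attention whose logits receive an additive bonus when the query and key notes share a bar number. The function name `bar_biased_attention`, the `same_bar_bonus` parameter, and the toy inputs are all hypothetical.

```python
import numpy as np

def bar_biased_attention(q, k, v, bar_ids, same_bar_bonus=1.0):
    """Causal attention with an additive logit bonus for same-bar pairs.

    Illustrative assumption only; not the mechanism defined in the paper.
    q, k, v: arrays of shape (n, d); bar_ids: integer array of shape (n,).
    """
    n, d = q.shape
    # Standard scaled dot-product logits.
    logits = q @ k.T / np.sqrt(d)
    # Metadata term: True where query note i and key note j share a bar.
    same_bar = bar_ids[:, None] == bar_ids[None, :]
    logits = logits + same_bar_bonus * same_bar
    # Causal mask: note i may only attend to notes 0..i.
    causal = np.tril(np.ones((n, n), dtype=bool))
    logits = np.where(causal, logits, -np.inf)
    # Numerically stable softmax over each row.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy usage: five notes, the first two in bar 0 and the rest in bar 1.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((5, 8)) for _ in range(3))
bar_ids = np.array([0, 0, 1, 1, 1])
output, weights = bar_biased_attention(q, k, v, bar_ids, same_bar_bonus=2.0)
```

Under this assumption, raising `same_bar_bonus` shifts attention mass toward notes in the current bar, while setting it to zero recovers ordinary causal attention. Other metadata named in the abstract, such as tempo, could in principle be injected the same way, but that is likewise speculation.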
