Generating Music with Structure Using Self-Similarity as Attention
Sophia Hager, Kathleen Hablutzel, and Katherine M. Kinnaird

TL;DR
This paper introduces a novel attention mechanism that uses self-similarity matrices to impose structural templates on generated music, improving the replication of musical structure in AI-generated compositions.
Contribution
The paper presents a new attention layer leveraging user-supplied self-similarity matrices, integrated into a deep learning system for structured music generation, demonstrating improved structural replication.
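For readers unfamiliar with the term, a self-similarity matrix compares every time step of a piece with every other time step, so repeated sections show up as high off-diagonal values. The sketch below is only an illustration of that idea: the cosine-similarity measure, the function name `self_similarity_matrix`, and the toy two-dimensional features are assumptions made here, not details taken from the paper.

```python
import numpy as np

def self_similarity_matrix(features):
    """Cosine self-similarity of a (T, D) feature sequence, returned as (T, T).

    Cosine similarity is an assumption for illustration; the paper may
    use a different similarity or distance measure.
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T

# Toy template with an A-B-A form: steps 0 and 2 carry the same features.
template = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 0.0]])
ssm = self_similarity_matrix(template)
```

In this toy case the entry for steps 0 and 2 is high while the entry for steps 0 and 1 is low, which is the kind of repetition pattern a structural template is meant to encode.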
Findings
Attention mechanism enhances structural fidelity in generated music
Model outperforms baseline without attention on unseen data
Significant improvement in replicating specific musical structures
Abstract
Despite innovations in deep learning and generative AI, creating long-term structure, as well as the layers of repeated structure common in musical works, remains an open challenge in music generation. We propose an attention layer that takes a novel approach, applying user-supplied self-similarity matrices to previous time steps, and demonstrate it in our Similarity Incentivized Neural Generator (SING) system, a deep learning autonomous music generation system with two layers. The first is a vanilla Long Short-Term Memory layer, and the second is the proposed attention layer. During generation, this attention mechanism imposes a suggested structure from a template piece on the generated music. We train SING on the MAESTRO dataset using a novel variable batching method, and compare its performance to the same model without the attention mechanism. The addition of our proposed attention…
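One way to picture an attention layer that applies a self-similarity matrix to previous time steps is sketched below. This is a minimal illustration under stated assumptions, not the SING implementation: the softmax normalization, the function name `ssm_attention_step`, the zero vector returned at the first step, and the toy inputs are all hypothetical choices made here.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(x - np.max(x))
    return exp / exp.sum()

def ssm_attention_step(prev_states, ssm, t):
    """Weighted sum of the states from steps 0..t-1.

    Row t of the template self-similarity matrix supplies the scores,
    so steps the template marks as similar to step t receive more weight.
    Softmax normalization is an assumption for this sketch.
    """
    if t == 0:
        return np.zeros(prev_states.shape[1])  # no history to attend to yet
    weights = softmax(ssm[t, :t])
    return weights @ prev_states[:t]

# Toy example: three one-hot "hidden states" and an A-B-A template,
# scaled so the similarity pattern dominates the softmax.
states = np.eye(3)
template_ssm = 10.0 * np.array([[1.0, 0.0, 1.0],
                                [0.0, 1.0, 0.0],
                                [1.0, 0.0, 1.0]])
context = ssm_attention_step(states, template_ssm, t=2)
```

Because the template marks step 2 as similar to step 0, nearly all of the attention weight in this toy case falls on the state from step 0.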
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training
