A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling
Z. Guo, J. Kang, D. Herremans

TL;DR
This paper introduces a musically meaningful embedding space and a novel attention mechanism for symbolic music modeling, leading to improved performance and more coherent music generation compared to existing transformer models.
Contribution
The paper proposes the Fundamental Music Embedding (FME) and RIPO attention, which integrate domain knowledge into transformer architectures for symbolic music and improve both modeling and generation quality.
Findings
The RIPO transformer outperforms state-of-the-art transformer models on a melody completion task.
Music generated by the RIPO transformer shows less degeneration and higher quality than output from existing transformer models.
The approach effectively captures both absolute and relative musical attributes.
Abstract
Following the success of the transformer architecture in the natural language domain, transformer-like architectures have been widely applied to the domain of symbolic music recently. Symbolic music and text, however, are two different modalities. Symbolic music contains multiple attributes, both absolute attributes (e.g., pitch) and relative attributes (e.g., pitch interval). These relative attributes shape human perception of musical motifs. These important relative attributes, however, are mostly ignored in existing symbolic music modeling methods with the main reason being the lack of a musically-meaningful embedding space where both the absolute and relative embeddings of the symbolic music tokens can be efficiently represented. In this paper, we propose the Fundamental Music Embedding (FME) for symbolic music based on a bias-adjusted sinusoidal encoding within which both the…
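The idea of a sinusoidal embedding in which relative attributes stay recoverable can be illustrated with a small sketch. This is not the authors' implementation: the function name `sinusoidal_embedding`, the dimension, the base of 10000, and the constant bias vector are all assumptions chosen for illustration. It only shows a general property of sinusoidal encodings with an additive bias: the distance between two embedded pitches depends on their interval alone, not on the absolute pitches.

```python
import numpy as np

def sinusoidal_embedding(value, dim=8, base=10000.0, bias=None):
    """Embed a scalar (e.g. a MIDI pitch) with sines and cosines.

    Hypothetical helper: dim, base, and bias are illustrative choices,
    not values taken from the paper.
    """
    k = np.arange(dim // 2)
    freqs = base ** (-2.0 * k / dim)      # one frequency per sin/cos pair
    angles = value * freqs
    emb = np.concatenate([np.sin(angles), np.cos(angles)])
    if bias is not None:
        emb = emb + bias                  # additive bias, same for every input
    return emb

bias = np.full(8, 0.5)  # assumed constant bias for the sketch

def distance(p, q):
    """Euclidean distance between the embeddings of pitches p and q."""
    return np.linalg.norm(
        sinusoidal_embedding(p, bias=bias) - sinusoidal_embedding(q, bias=bias)
    )

# Two major thirds (4 semitones) at different absolute pitches.
d_third_low = distance(60, 64)    # C4 -> E4
d_third_high = distance(67, 71)   # G4 -> B4

# A perfect fifth (7 semitones) for comparison.
d_fifth = distance(60, 67)        # C4 -> G4

print(np.isclose(d_third_low, d_third_high))  # same interval, same distance
print(np.isclose(d_third_low, d_fifth))       # different interval, different distance
```

The bias cancels when two embeddings are subtracted, so in this sketch it shifts absolute positions in the space without changing how intervals are represented.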
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
