F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation

Manvi Agarwal (IP Paris, LTCI, IDS), Changhong Wang (LTCI), Gaël Richard (S2A, IDS)

arXiv:2502.10491·cs.SD·February 18, 2025

TL;DR

F-StrIPE introduces a linear-complexity, structure-informed positional encoding for Transformers, enhancing symbolic music generation by efficiently incorporating musical structure priors.

Contribution

The paper proposes F-StrIPE, a structure-informed positional encoding (PE) scheme that runs in linear complexity. Using kernel approximation techniques based on random features, it shows that F-StrIPE is a generalization of Stochastic Positional Encoding (SPE).

Findings

01. F-StrIPE improves melody harmonization quality.

02. It scales linearly with sequence length, avoiding the quadratic cost that Transformers otherwise carry.

03. The approach effectively encodes musical structure in Transformer models.

Abstract

While music remains a challenging domain for generative models like Transformers, recent progress has been made by exploiting suitable musically-informed priors. One technique to leverage information about musical structure in Transformers is inserting such knowledge into the positional encoding (PE) module. However, Transformers carry a quadratic cost in sequence length. In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. Using existing kernel approximation techniques based on random features, we show that F-StrIPE is a generalization of Stochastic Positional Encoding (SPE). We illustrate the empirical merits of F-StrIPE using melody harmonization for symbolic music.
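To make the phrase "kernel approximation techniques based on random features" concrete, here is a minimal sketch of generic random Fourier features in NumPy. It is not the authors' F-StrIPE implementation: the Gaussian kernel, the toy 1-D positions, and all names (`random_fourier_features`, `lengthscale`, `num_features`) are assumptions chosen for illustration. It shows only the general idea that a position-dependent kernel matrix can be replaced by a product of feature maps, which allows the product with the values to be computed without ever forming an n-by-n matrix.

```python
import numpy as np

def random_fourier_features(x, num_features, lengthscale, rng):
    """Map x of shape (n, d) to z of shape (n, num_features) so that
    z @ z.T approximates the Gaussian kernel exp(-||x - y||^2 / (2 * lengthscale^2)).
    Generic illustration only, not the paper's feature map."""
    d = x.shape[1]
    # Frequencies sampled from the kernel's spectral density (a Gaussian).
    omega = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    # Random phases make the cosine features unbiased for the kernel.
    phase = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(x @ omega + phase)

rng = np.random.default_rng(0)
n, lengthscale = 64, 8.0
positions = np.arange(n, dtype=float)[:, None]  # hypothetical 1-D token positions

z = random_fourier_features(positions, num_features=4096,
                            lengthscale=lengthscale, rng=rng)

# Exact kernel matrix: quadratic in sequence length.
diff = positions - positions.T
exact = np.exp(-diff**2 / (2.0 * lengthscale**2))

# Low-rank approximation built from the random features.
approx = z @ z.T
max_error = np.abs(approx - exact).max()

# Associativity lets us skip the n-by-n matrix: z @ (z.T @ values)
# costs O(n * num_features * d_v) instead of O(n^2 * d_v).
values = rng.normal(size=(n, 16))
linear_out = z @ (z.T @ values)
quadratic_out = approx @ values

print(f"max kernel approximation error: {max_error:.3f}")
```

The approximation error shrinks as `num_features` grows, so the feature count trades accuracy against cost. How F-StrIPE chooses its kernel and injects musical structure is described in the paper itself and is not reproduced here.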