F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation
Manvi Agarwal (IP Paris, LTCI, IDS), Changhong Wang (LTCI), Gael Richard (S2A, IDS)

TL;DR
F-StrIPE introduces a linear-complexity, structure-informed positional encoding for Transformers, enhancing symbolic music generation by efficiently incorporating musical structure priors.
Contribution
The paper proposes F-StrIPE, a positional encoding (PE) scheme that generalizes Stochastic Positional Encoding (SPE) through random-feature kernel approximation, enabling structure-aware music modeling in linear complexity.
Findings
F-StrIPE shows empirical merits on melody harmonization for symbolic music.
Its cost scales linearly with sequence length, whereas standard Transformer attention scales quadratically.
Prior knowledge about musical structure enters the model through the positional encoding module.
Abstract
While music remains a challenging domain for generative models like Transformers, recent progress has been made by exploiting suitable musically-informed priors. One technique to leverage information about musical structure in Transformers is inserting such knowledge into the positional encoding (PE) module. However, Transformers carry a quadratic cost in sequence length. In this paper, we propose F-StrIPE, a structure-informed PE scheme that works in linear complexity. Using existing kernel approximation techniques based on random features, we show that F-StrIPE is a generalization of Stochastic Positional Encoding (SPE). We illustrate the empirical merits of F-StrIPE using melody harmonization for symbolic music.
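To make the linear-complexity idea concrete, here is a minimal sketch of generic random-feature attention, not the authors' F-StrIPE implementation. It assumes positive random features that approximate the kernel exp(q · k), and it omits positional and structural information entirely. The function names (`random_feature_map`, `attention_quadratic`, `attention_linear`) are hypothetical and chosen for illustration only. The sketch shows that once attention weights are written as an inner product of feature vectors, the key-value summary can be computed once and reused for every query, so no N × N score matrix is ever formed.

```python
import math
import random


def random_feature_map(dim, num_features, seed=0):
    """Build phi such that E[phi(x) . phi(y)] = exp(x . y).

    Assumption for this sketch: positive random features with
    Gaussian projections w ~ N(0, I).
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]
    scale = 1.0 / math.sqrt(num_features)

    def phi(x):
        half_sq_norm = sum(v * v for v in x) / 2.0
        return [
            scale * math.exp(sum(wi * xi for wi, xi in zip(w, x)) - half_sq_norm)
            for w in W
        ]

    return phi


def attention_quadratic(Q, K, V, phi):
    """Reference version: computes all N x N kernel scores explicitly."""
    fq = [phi(q) for q in Q]
    fk = [phi(k) for k in K]
    d_v = len(V[0])
    out = []
    for q in fq:
        weights = [sum(a * b for a, b in zip(q, k)) for k in fk]
        total = sum(weights)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) / total for j in range(d_v)]
        )
    return out


def attention_linear(Q, K, V, phi):
    """Same result, but keys and values are summarized once (linear in N)."""
    fk = [phi(k) for k in K]
    m, d_v = len(fk[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(m)]  # S = sum_n phi(k_n) v_n^T
    z = [0.0] * m                        # z = sum_n phi(k_n)
    for f, v in zip(fk, V):
        for i in range(m):
            z[i] += f[i]
            for j in range(d_v):
                S[i][j] += f[i] * v[j]
    out = []
    for q in Q:
        fq = phi(q)
        denom = sum(a * b for a, b in zip(fq, z))
        out.append(
            [sum(fq[i] * S[i][j] for i in range(m)) / denom for j in range(d_v)]
        )
    return out
```

Both functions return the same values up to floating-point rounding; only the order of the summations differs. The linear version costs on the order of N · m · d_v operations for N tokens and m random features, instead of growing with N².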