Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
Yungang Yi, Weihua Li, Matthew Kuo, Catherine Shi, Quan Bai

TL;DR
This paper introduces DSMR, a resource-efficient recurrent attention method for modeling long musical sequences. On MAESTRO it matches a full-memory recurrent reference in perplexity while using approximately 59% less GPU memory.
Contribution
The paper presents Depth-Structured Music Recurrence (DSMR), a novel training design that enables end-to-end learning of full compositions with limited recurrent memory, improving efficiency in symbolic music modeling.
Findings
Two-scale DSMR matches a full-memory recurrent reference in perplexity on the MAESTRO dataset (5.96 vs. 5.98).
It uses approximately 59% less GPU memory than that reference.
It achieves roughly 36% higher throughput than that reference.
Abstract
Long-context modeling is essential for symbolic music generation, since motif repetition and developmental variation can span thousands of musical events, yet practical workflows frequently rely on resource-limited hardware. We introduce Depth-Structured Music Recurrence (DSMR), a training-time design that learns from complete compositions end to end by streaming each piece left-to-right with stateful recurrent attention and distributing layer-wise memory horizons under a fixed recurrent-state budget. Our main instantiation, two-scale DSMR, assigns long history windows to lower layers and a uniform short window to the remaining layers. On the MAESTRO piano performance dataset, two-scale DSMR matches a full-memory recurrent reference in perplexity (5.96 vs. 5.98) while using approximately 59% less GPU memory and achieving roughly 36% higher throughput. Variant analyses further show…
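The two-scale allocation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `two_scale_horizons`, its parameters, the even split of the leftover budget, and every number in the usage example are assumptions made for illustration. The abstract states only that lower layers receive long history windows and the remaining layers share a uniform short window under a fixed recurrent-state budget.

```python
from typing import List


def two_scale_horizons(num_layers: int, num_long_layers: int,
                       long_window: int, budget: int) -> List[int]:
    """Return per-layer memory horizons (in tokens), lowest layer first.

    Hypothetical helper: the lowest `num_long_layers` layers get
    `long_window` tokens of history; the remaining layers split what is
    left of `budget` evenly (rounded down), so the total never exceeds
    the budget.
    """
    if not 0 < num_long_layers < num_layers:
        raise ValueError("need at least one long layer and one short layer")
    remaining = budget - num_long_layers * long_window
    num_short_layers = num_layers - num_long_layers
    short_window = remaining // num_short_layers
    if short_window < 1:
        raise ValueError("budget too small for the requested long windows")
    return [long_window] * num_long_layers + [short_window] * num_short_layers


# Illustrative numbers only; none of these values come from the paper.
horizons = two_scale_horizons(num_layers=12, num_long_layers=2,
                              long_window=2048, budget=6656)
print(horizons)       # two long horizons followed by ten short ones
print(sum(horizons))  # total recurrent state, at most the budget
```

With these assumed values, a uniform split of the same budget would give every layer roughly 554 tokens of history, whereas the two-scale split lets the two lowest layers see 2048 tokens each at the same total cost.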
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Neuroscience and Music Perception
