Pay (Cross) Attention to the Melody: Curriculum Masking for Single-Encoder Melodic Harmonization
Maximos Kaliakatsos-Papakostas, Dimos Makris, Konstantinos Soiledis, Konstantinos-Theodoros Tsamis, Vassilis Katsouros, Emilios Cambouropoulos

TL;DR
This paper introduces FF (full-to-full), a training curriculum for single-encoder melodic harmonization models that strengthens melody-harmony interaction and improves performance on out-of-domain melodies.
Contribution
The paper proposes the FF curriculum, a progressive unmasking strategy that enhances melody-harmony attention in transformer models for melodic harmonization.
Findings
The FF curriculum outperforms prior training curricula across multiple metrics
Quarter-note quantization and pitch-class representations improve performance
Models show strong adaptability to out-of-domain melodic inputs
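The two representation choices named in the findings can be illustrated with a minimal sketch. It assumes MIDI pitch numbers and onsets measured in quarter-note beats; the function names and the `steps_per_beat` parameter are hypothetical and are not taken from the paper, whose actual tokenization may differ.

```python
import math


def pitch_class(midi_pitch: int) -> int:
    """Map a MIDI pitch to its pitch class (0 = C, 1 = C#, ..., 11 = B)."""
    return midi_pitch % 12


def quantize_onset(onset_beats: float, steps_per_beat: int) -> float:
    """Snap an onset (in quarter-note beats) to the nearest grid position.

    Assumption: steps_per_beat=1 gives a quarter-note grid and
    steps_per_beat=4 gives a sixteenth-note grid.
    """
    # floor(x + 0.5) rounds halves upward, avoiding Python's round-half-to-even.
    step_index = math.floor(onset_beats * steps_per_beat + 0.5)
    return step_index / steps_per_beat


if __name__ == "__main__":
    # Middle C (60) and the C an octave above (72) share a pitch class.
    print(pitch_class(60), pitch_class(72))
    # The same onset lands on different grid points at each resolution.
    print(quantize_onset(1.3, 1), quantize_onset(1.3, 4))
```

A coarser grid and an octave-free pitch encoding both shrink the input vocabulary, which is one plausible reason such choices could help; the snippet only shows the mappings themselves.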
Abstract
Melodic harmonization, the task of generating harmonic accompaniments for a given melody, remains a central challenge in computational music generation. Recent single-encoder transformer approaches have framed harmonization as a masked sequence modeling problem, but existing training curricula inspired by discrete diffusion often result in weak (cross) attention between melody and harmony. This leads to limited exploitation of melodic cues, particularly in out-of-domain contexts. In this work, we introduce a training curriculum, FF (full-to-full), which keeps all harmony tokens masked for several training steps before progressively unmasking entire sequences, in order to strengthen melody-harmony interactions. We systematically evaluate this approach against prior curricula across multiple experimental axes, including temporal quantization (quarter vs. sixteenth note), bar-level…
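The curriculum described in the abstract (an initial phase with every harmony token masked, followed by progressive unmasking) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the linear schedule, the `warmup_steps` parameter, the random choice of which positions to reveal, and the `<mask>` token are all hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def visible_fraction(step: int, total_steps: int, warmup_steps: int) -> float:
    """Fraction of harmony tokens left unmasked at a given training step.

    Assumption: 0.0 during the warm-up phase (all harmony masked), then a
    linear ramp to 1.0 by the final step. The paper's schedule may differ.
    """
    if step < warmup_steps:
        return 0.0
    ramp_length = max(1, total_steps - warmup_steps)
    return min(1.0, (step - warmup_steps) / ramp_length)


def mask_harmony(harmony: list, fraction: float, rng: random.Random) -> list:
    """Reveal roughly `fraction` of the harmony tokens and mask the rest."""
    n_visible = int(fraction * len(harmony))
    visible = set(rng.sample(range(len(harmony)), n_visible))
    return [tok if i in visible else MASK for i, tok in enumerate(harmony)]


if __name__ == "__main__":
    rng = random.Random(0)
    chords = ["C", "Am", "F", "G"]
    for step in (0, 20, 60, 100):
        frac = visible_fraction(step, total_steps=100, warmup_steps=20)
        print(step, frac, mask_harmony(chords, frac, rng))
```

During the warm-up phase the model sees only the melody as context for every harmony position, which is the situation the abstract associates with stronger melody-harmony attention.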
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Neuroscience and Music Perception
