An Comparative Analysis of Different Pitch and Metrical Grid Encoding Methods in the Task of Sequential Music Generation
Yuqiang Li, Shengchen Li, George Fazekas

TL;DR
This study compares various pitch and metrical encoding methods in symbolic music generation, revealing that class-octave encoding outperforms MIDI encoding and that finer rhythmic grids enhance rhythmic quality but risk overfitting.
Contribution
It provides an integrated analysis of pitch and meter encoding methods, highlighting their effects on model performance and offering insights for feature engineering in music generation.
Findings
Class-octave encoding outperforms MIDI in pitch metrics.
Finer rhythmic grids improve rhythmic quality.
Overfitting occurs with smaller networks and lower embedding dimensions.
Abstract
Pitch and meter are two fundamental music features for symbolic music generation tasks, where researchers usually choose different encoding methods depending on specific goals. However, the advantages and drawbacks of different encoding methods have not been frequently discussed. This paper presents an integrated analysis of the influence of two low-level features, pitch and meter, on the performance of a token-based sequential music generation model. First, the commonly used MIDI number encoding and a less used class-octave encoding are compared. Second, a dense intra-bar metric grid is imposed on the encoded sequence as auxiliary features. Different complexities and resolutions of the metric grid are compared. For complexity, the single token approach and the multiple token approach are compared; for grid resolution, 0 (ablation), 1 (bar-level), 4 (downbeat-level), 12 (8th-triplet-level)…
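To make the two encodings in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the octave is taken as a zero-based index (MIDI number divided by 12), and an onset is assumed to be given as a fraction of the bar. How grid indices are turned into single or multiple tokens is not specified in this excerpt, so the sketch stops at computing the index.

```python
import math
from fractions import Fraction
from typing import Optional, Tuple


def midi_to_class_octave(midi_number: int) -> Tuple[int, int]:
    """Split a MIDI note number (0-127) into (pitch_class, octave_index).

    Assumption: octave_index is zero-based (midi_number // 12), so it is
    not the scientific-pitch octave label.
    """
    if not 0 <= midi_number <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    octave_index, pitch_class = divmod(midi_number, 12)
    return pitch_class, octave_index


def class_octave_to_midi(pitch_class: int, octave_index: int) -> int:
    """Inverse of midi_to_class_octave."""
    return octave_index * 12 + pitch_class


def grid_index(onset_in_bar: Fraction, resolution: int) -> Optional[int]:
    """Quantize an onset, given as a fraction of the bar in [0, 1),
    to a metric grid with `resolution` equal cells per bar.

    Assumption: resolution 0 (the ablation setting) means no grid
    feature at all, represented here as None.
    """
    if resolution == 0:
        return None
    if not 0 <= onset_in_bar < 1:
        raise ValueError("onset_in_bar must lie in [0, 1)")
    return math.floor(onset_in_bar * resolution)


# Middle C (MIDI 60) becomes pitch class 0 in octave index 5.
pitch_class, octave_index = midi_to_class_octave(60)

# An onset a quarter of the way through the bar, at several resolutions.
onset = Fraction(1, 4)
indices = {res: grid_index(onset, res) for res in (0, 1, 4, 12)}
```

With class-octave encoding, notes an octave apart share a pitch-class value, whereas MIDI numbers treat them as unrelated symbols. For the grid, a coarser resolution collapses more onsets onto the same index: at resolution 1 every onset in the bar maps to index 0.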
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
Methods: Attention Is All You Need · Linear Layer · Softmax · Adaptive Softmax · Layer Normalization · Adaptive Input Representations · Adam · Multi-Head Attention
