Impact of time and note duration tokenizations on deep learning symbolic music modeling
Nathan Fradet, Nicolas Gutowski, Fabien Chhel, Jean-Pierre Briot

TL;DR
This paper investigates how different tokenization strategies, especially time and note duration representations, affect the performance of deep learning models in symbolic music tasks, highlighting the importance of explicit information for various applications.
Contribution
It provides a comparative analysis of tokenization methods in symbolic music modeling, emphasizing the impact of explicit time and note duration representations on model performance.
Findings
Explicit tokenization improves task-specific performance.
Time and note duration representations influence music classification and generation.
Tokenization choice affects model reasoning and explicit information capture.
Abstract
Symbolic music is widely used in various deep learning tasks, including generation, transcription, synthesis, and Music Information Retrieval (MIR). It is mostly employed with discrete models like Transformers, which require music to be tokenized, i.e., formatted into sequences of distinct elements called tokens. Tokenization can be performed in different ways. As Transformers can struggle at reasoning but capture explicit information more easily, it is important to study how the way information is represented for such models impacts their performance. In this work, we analyze the common tokenization methods and experiment with time and note duration representations. We compare the performance of these two impactful criteria on several tasks, including composer and emotion classification, music generation, and sequence representation learning. We demonstrate that explicit…
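To make the two criteria concrete, the following minimal sketch tokenizes the same three notes in three ways. It is an illustration under stated assumptions, not the authors' implementation: the token names (`TimeShift`, `Bar`, `Position`, `Duration`, `NoteOn`, `NoteOff`) follow common conventions in symbolic music tokenization, and the note format, tick resolution, and function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy note format: (pitch, onset_tick, duration_ticks).
Note = Tuple[int, int, int]


def timeshift_duration(notes: List[Note]) -> List[str]:
    """Time as relative shifts; note length as an explicit Duration token."""
    tokens, current = [], 0
    for pitch, onset, dur in sorted(notes, key=lambda n: (n[1], n[0])):
        if onset > current:
            tokens.append(f"TimeShift_{onset - current}")
            current = onset
        tokens += [f"Pitch_{pitch}", f"Duration_{dur}"]
    return tokens


def timeshift_noteoff(notes: List[Note]) -> List[str]:
    """Time as relative shifts; note length implied by a later NoteOff token."""
    events = []
    for pitch, onset, dur in notes:
        events.append((onset, 1, pitch, "NoteOn"))
        events.append((onset + dur, 0, pitch, "NoteOff"))
    events.sort()  # at equal ticks, NoteOff (0) sorts before NoteOn (1)
    tokens, current = [], 0
    for tick, _, pitch, kind in events:
        if tick > current:
            tokens.append(f"TimeShift_{tick - current}")
            current = tick
        tokens.append(f"{kind}_{pitch}")
    return tokens


def bar_position_duration(notes: List[Note], ticks_per_bar: int = 16) -> List[str]:
    """Time as explicit Bar and Position tokens; note length as Duration."""
    tokens, current_bar, current_pos = [], -1, None
    for pitch, onset, dur in sorted(notes, key=lambda n: (n[1], n[0])):
        bar, pos = divmod(onset, ticks_per_bar)
        while current_bar < bar:  # emit one Bar token per bar crossed
            tokens.append("Bar")
            current_bar += 1
            current_pos = None
        if pos != current_pos:
            tokens.append(f"Position_{pos}")
            current_pos = pos
        tokens += [f"Pitch_{pitch}", f"Duration_{dur}"]
    return tokens


notes = [(60, 0, 4), (64, 4, 4), (67, 20, 8)]
print(timeshift_duration(notes))
print(timeshift_noteoff(notes))
print(bar_position_duration(notes))
```

In the `NoteOff` variant the model has to pair each `NoteOff` with its earlier `NoteOn` to recover a note's length, whereas the `Duration` variants state it directly. Likewise, `Bar` and `Position` tokens state where a note falls in the bar, while `TimeShift` tokens leave the absolute position to be accumulated from the preceding shifts.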
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Byte Pair Encoding · Linear Layer · Label Smoothing · Residual Connection · Adam · Absolute Position Encodings · Layer Normalization
