Improved symbolic drum style classification with grammar-based hierarchical representations

Léo Géré (CNAM Paris; CEDRIC - VERTIGO); Philippe Rigaux (CEDRIC - VERTIGO; CNAM Paris); Nicolas Audebert (CEDRIC - VERTIGO; CNAM; IGN; LaSTIG)

arXiv:2407.17536·cs.SD·July 26, 2024·1 cites

Open Access

TL;DR

This paper introduces a grammar-based hierarchical representation of MIDI data that enhances deep learning models' ability to classify musical styles, especially drumming, by capturing high-level rhythmic information more effectively.

Contribution

It proposes a novel tree-based MIDI representation using a context-free grammar, improving style classification accuracy and efficiency over traditional tokenization methods.
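To make the idea of a tree-based rhythmic representation concrete, here is a minimal sketch. It is not the paper's grammar, which this summary does not reproduce. It assumes a much simpler hypothetical scheme: a bar is split recursively into two equal halves until each span holds at most one onset at its start, and the resulting tree is flattened into bracketed tokens. The names `rhythm_tree`, `linearize`, and the symbols `x` and `-` are illustrative choices.

```python
from fractions import Fraction


def rhythm_tree(onsets, start=Fraction(0), end=Fraction(1), max_depth=4):
    """Build a binary subdivision tree for onsets inside [start, end).

    Leaves are "x" (a note starts this span) or "-" (nothing starts here).
    Assumption: binary splits only; a real grammar would also allow
    other subdivisions such as triplets.
    """
    inside = sorted({t for t in onsets if start <= t < end})
    if not inside:
        return "-"
    # Stop when the only onset sits at the span start, or at the depth cap.
    if inside == [start] or max_depth == 0:
        return "x"
    mid = (start + end) / 2
    return [
        rhythm_tree(inside, start, mid, max_depth - 1),
        rhythm_tree(inside, mid, end, max_depth - 1),
    ]


def linearize(tree):
    """Flatten the tree into a bracketed token sequence for a sequence model."""
    if isinstance(tree, str):
        return [tree]
    tokens = ["("]
    for child in tree:
        tokens += linearize(child)
    return tokens + [")"]


# One bar with hits on beat 1, beat 3, and the "and" of beat 3 (positions in bar units).
bar = [Fraction(0), Fraction(1, 2), Fraction(3, 4)]
tree = rhythm_tree(bar)
tokens = linearize(tree)
```

In this toy encoding the nesting of the brackets carries the metrical structure that a flat list of durations leaves implicit, which is the kind of high-level rhythmic information the contribution refers to.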

Findings

01. Grammar-based representation outperforms generic tokenization.

02. Enhanced rhythmic encoding improves classification accuracy.

03. The resulting model architecture is more compact and parameter-efficient.

Abstract

Deep learning models have become a critical tool for analysis and classification of musical data. These models operate either on the audio signal, e.g. waveform or spectrogram, or on a symbolic representation, such as MIDI. In the latter, musical information is often reduced to basic features, i.e. durations, pitches and velocities. Most existing works then rely on generic tokenization strategies from classical natural language processing, or matrix representations, e.g. piano roll. In this work, we evaluate how enriched representations of symbolic data can impact deep models, i.e. Transformers and RNNs, for music style classification. In particular, we examine representations that explicitly incorporate musical information implicitly present in MIDI-like encodings, such as rhythmic organization, and show that they outperform generic tokenization strategies. We introduce a new tree-based…

Taxonomy

Topics: Music and Audio Processing