Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining
TJ Tsai, Kevin Ji

TL;DR
This paper introduces an approach to composer style classification of piano sheet music images that pretrains a language model on unlabeled data and then finetunes it as a classifier, significantly improving accuracy over training on labeled data alone.
Contribution
It recasts sheet music classification as a language modeling problem and shows that transformer models pretrained on unlabeled data are effective for it.
Findings
Language model pretraining on unlabeled data improves classification accuracy.
Transformer architectures outperform CNN and LSTM models.
A pretrained GPT-2 model achieves 70% accuracy on a 9-way classification task.
Abstract
This paper studies composer style classification of piano sheet music images. Previous approaches to the composer classification task have been limited by a scarcity of data. We address this issue in two ways: (1) we recast the problem to be based on raw sheet music images rather than a symbolic music format, and (2) we propose an approach that can be trained on unlabeled data. Our approach first converts the sheet music image into a sequence of musical "words" based on the bootleg feature representation, and then feeds the sequence into a text classifier. We show that it is possible to significantly improve classifier performance by first training a language model on a set of unlabeled data, initializing the classifier with the pretrained language model weights, and then finetuning the classifier on a small amount of labeled data. We train AWD-LSTM, GPT-2, and RoBERTa language models…
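The first step in the abstract, turning sheet music into a sequence of musical "words", can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each bootleg feature column is a binary vector marking notehead positions and that one column is packed into one integer-valued word. The function names, the hex encoding, the `<unk>` token, and the 4-row toy columns are all hypothetical.

```python
from typing import Dict, List, Sequence


def column_to_word(column: Sequence[int]) -> str:
    """Pack one binary column (1 = notehead present) into a hex string 'word'.

    Assumption: a column is a fixed-length 0/1 vector; the real feature
    size is not stated in this summary, so toy 4-row columns are used below.
    """
    value = 0
    for bit in column:
        if bit not in (0, 1):
            raise ValueError("column entries must be 0 or 1")
        value = (value << 1) | bit  # shift left and append the next bit
    return format(value, "x")


def columns_to_sentence(columns: Sequence[Sequence[int]]) -> List[str]:
    """Convert a sequence of columns into a sequence of words."""
    return [column_to_word(col) for col in columns]


def build_vocab(sentences: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct word; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for sentence in sentences:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence: Sequence[str], vocab: Dict[str, int]) -> List[int]:
    """Map words to ids, falling back to the <unk> id for unseen words."""
    return [vocab.get(word, vocab["<unk>"]) for word in sentence]


# Toy example: three columns with four rows each (hypothetical data).
toy_columns = [[0, 1, 0, 1], [1, 0, 0, 0], [0, 1, 0, 1]]
sentence = columns_to_sentence(toy_columns)
vocab = build_vocab([sentence])
print(sentence)                 # ['5', '8', '5']
print(encode(sentence, vocab))  # [1, 2, 1]
```

The resulting id sequences are the kind of input a text classifier or language model consumes. Because the conversion needs no composer labels, the same step applies to the unlabeled pretraining data and to the small labeled finetuning set.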
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
Methods: Linear Layer · Cosine Annealing · Dense Connections · WordPiece · Residual Connection · Byte Pair Encoding · Attention Is All You Need · Adam · Variational Dropout
