Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN
Elona Shatri, Kalikidhar Palavala, George Fazekas

TL;DR
This paper evaluates three GAN models for synthesising realistic handwritten music sheets to improve Optical Music Recognition systems, with CycleWGAN showing the best performance in quality and diversity.
Contribution
The paper provides a comprehensive comparison of three GAN architectures for handwritten music synthesis and proposes CycleWGAN, which outperforms the other two on this task.
Findings
CycleWGAN outperforms DCGAN and ProGAN in both quality and diversity.
CycleWGAN achieves a Fréchet Inception Distance (FID) of 41.87, an Inception Score (IS) of 2.29, and a Kernel Inception Distance (KID) of 0.05.
Synthesised music sheets can enhance OMR system performance.
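For readers unfamiliar with the FID figure quoted above, the metric fits a Gaussian to feature embeddings of real and generated images and measures the Fréchet distance between the two fits: ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2)). The following is a minimal sketch of that distance only, assuming the features have already been extracted (in practice from an Inception network). It is not the authors' evaluation code, and the function name `frechet_distance` is a hypothetical choice for illustration.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussian fits of two feature sets.

    Both inputs are arrays of shape (n_samples, n_features).
    Hypothetical helper for illustration; not the paper's pipeline.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))

    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # covariance matrices. Clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)
```

Lower values indicate that the generated feature distribution is closer to the real one. Identical feature sets give a distance of approximately zero, and shifting every sample by a constant vector adds the squared length of that vector.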
Abstract
The generation of handwritten music sheets is a crucial step toward enhancing Optical Music Recognition (OMR) systems, which rely on large and diverse datasets for optimal performance. However, handwritten music sheets, often found in archives, present challenges for digitisation due to their fragility, varied handwriting styles, and image quality. This paper addresses the data scarcity problem by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. We provide a comprehensive evaluation of three GAN models - DCGAN, ProGAN, and CycleWGAN - comparing their ability to generate diverse and high-quality handwritten music images. The proposed CycleWGAN model, which enhances style transfer and training stability, significantly outperforms DCGAN and ProGAN in both qualitative and quantitative evaluations. CycleWGAN achieves superior performance, with…
Taxonomy
Topics: Music Technology and Sound Studies · Human Motion and Animation · Interactive and Immersive Displays
Methods: Local Response Normalization · 1x1 Convolution · Convolution · Batch Normalization · Deep Convolutional GAN · WGAN-GP Loss · Dense Connections · Progressively Growing GAN
