Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Hao-Wen Dong, Yi-Hsuan Yang

TL;DR
This paper introduces a convolutional GAN model with binary neurons that directly generates binary-valued piano-rolls for polyphonic music, eliminating the need for post-processing and improving quality.
Contribution
It proposes a refiner network, appended to the generator and using binary neurons at its output layer, that is trained in two stages and produces better binary-valued piano-rolls than post-processing by hard thresholding or Bernoulli sampling.
Findings
Binary neurons outperform thresholding and Bernoulli sampling.
Deterministic binary neurons yield better subjective and objective results.
The model directly produces binary piano-rolls without post-processing.
Abstract
It has been shown recently that deep convolutional generative adversarial networks (GANs) can learn to generate music in the form of piano-rolls, which represent music by binary-valued time-pitch matrices. However, existing models can only generate real-valued piano-rolls and require further post-processing, such as hard thresholding (HT) or Bernoulli sampling (BS), to obtain the final binary-valued results. In this paper, we study whether we can have a convolutional GAN model that directly creates binary-valued piano-rolls by using binary neurons. Specifically, we propose to append to the generator an additional refiner network, which uses binary neurons at the output layer. The whole network is trained in two stages. Firstly, the generator and the discriminator are pretrained. Then, the refiner network is trained along with the discriminator to learn to binarize the real-valued…
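The two post-processing baselines named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy piano-roll, and the 0.5 threshold are assumptions chosen for illustration, and it only shows how a real-valued piano-roll with values in [0, 1] would be binarized after generation.

```python
import random


def hard_threshold(roll, threshold=0.5):
    """Hard thresholding (HT): a note is on when its value exceeds the threshold.

    The 0.5 default is an assumption for this sketch, not a value from the paper.
    """
    return [[1 if value > threshold else 0 for value in row] for row in roll]


def bernoulli_sample(roll, rng):
    """Bernoulli sampling (BS): each value is treated as the probability that the note is on."""
    return [[1 if rng.random() < value else 0 for value in row] for row in roll]


# Hypothetical real-valued generator output: 3 time steps x 4 pitches.
real_valued_roll = [
    [0.9, 0.1, 0.6, 0.0],
    [0.8, 0.4, 0.5, 1.0],
    [0.2, 0.7, 0.3, 0.0],
]

ht_roll = hard_threshold(real_valued_roll)
bs_roll = bernoulli_sample(real_valued_roll, random.Random(0))

print(ht_roll)  # deterministic: the same input always gives the same binary roll
print(bs_roll)  # stochastic: depends on the random seed
```

HT is deterministic, while BS can switch a note on or off differently on each draw. In both cases the binarization happens outside the network, which is the step the proposed binary neurons are meant to replace.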
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
Methods: Layer Normalization · WGAN-GP Loss · Batch Normalization · Wasserstein GAN (Gradient Penalty) · Convolution
