Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation

Hao-Wen Dong; Yi-Hsuan Yang

arXiv:1804.09399·cs.LG·October 9, 2018·33 cites

TL;DR

This paper introduces a convolutional GAN model with binary neurons that directly generates binary-valued piano-rolls for polyphonic music, eliminating the need for post-processing and improving quality.

Contribution

It proposes appending to the generator a refiner network that uses binary neurons at its output layer. The whole network is trained in two stages, and it achieves better binary-valued music generation results than post-processing by thresholding.

Findings

01 Binary neurons outperform thresholding and Bernoulli sampling.

02 Deterministic binary neurons yield better subjective and objective results.

03 The model directly produces binary piano-rolls without post-processing.

Abstract

It has been shown recently that deep convolutional generative adversarial networks (GANs) can learn to generate music in the form of piano-rolls, which represent music by binary-valued time-pitch matrices. However, existing models can only generate real-valued piano-rolls and require further post-processing, such as hard thresholding (HT) or Bernoulli sampling (BS), to obtain the final binary-valued results. In this paper, we study whether we can have a convolutional GAN model that directly creates binary-valued piano-rolls by using binary neurons. Specifically, we propose to append to the generator an additional refiner network, which uses binary neurons at the output layer. The whole network is trained in two stages. Firstly, the generator and the discriminator are pretrained. Then, the refiner network is trained along with the discriminator to learn to binarize the real-valued…
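The two post-processing baselines named in the abstract, hard thresholding (HT) and Bernoulli sampling (BS), can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the 0.5 threshold, and the toy 4x3 piano-roll are assumptions chosen for illustration only.

```python
import numpy as np

def hard_threshold(probs, threshold=0.5):
    """HT: mark a note as on when its real value exceeds a fixed threshold.

    The default of 0.5 is an assumed value for this sketch.
    """
    return (probs > threshold).astype(np.uint8)

def bernoulli_sample(probs, rng):
    """BS: treat each real value as a note-on probability and sample from it."""
    return (rng.random(probs.shape) < probs).astype(np.uint8)

# Hypothetical real-valued piano-roll: 4 time steps x 3 pitches, values in [0, 1].
probs = np.array([
    [0.9, 0.1, 0.5],
    [0.7, 0.3, 0.0],
    [0.2, 0.8, 1.0],
    [0.6, 0.4, 0.55],
])

rng = np.random.default_rng(0)
ht = hard_threshold(probs)         # deterministic: same output every call
bs = bernoulli_sample(probs, rng)  # stochastic: depends on the random draw

print(ht)
print(bs)
```

Both functions run after generation, on a finished real-valued matrix. The approach described in the abstract instead places the binarization inside the network, at the output layer of the refiner, so the discriminator sees binary-valued piano-rolls during training.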

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing

Methods: Layer Normalization · WGAN-GP Loss · Batch Normalization · Wasserstein GAN (Gradient Penalty) · Convolution