Music Generation based on Generative Adversarial Networks with Transformer
Ziyi Jiang, Ruoxue Wu, Zhenghan Chen, Xiaoxuan Liang

TL;DR
This paper trains Transformer-based music generation models with an adversarial (GAN) objective alongside the usual likelihood objective, targeting the drop in sample quality that appears when generating long sequences.
Contribution
It proposes adversarial training for Transformer-based music models using a pre-trained Span-BERT discriminator, which improves training stability, together with Gumbel-Softmax for handling discrete token sequences, with the aim of improving long-sequence generation quality.
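As a rough illustration of the Gumbel-Softmax idea mentioned above, the sketch below draws a relaxed (soft) one-hot sample from a vector of token logits. It is a generic, standard-library version of the technique, not the authors' implementation: the function name `gumbel_softmax`, the example logits, and the temperature values are all assumptions chosen for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories.

    Hypothetical helper for illustration; not taken from the paper.
    """
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
token_logits = [2.0, 0.5, -1.0]  # assumed scores for three candidate tokens
soft = gumbel_softmax(token_logits, temperature=1.0)
sharp = gumbel_softmax(token_logits, temperature=0.1)
```

Each call returns a probability vector rather than a hard token index. Lower temperatures tend to push the vector toward a one-hot choice, while the sample stays a smooth function of the logits, which is what lets a discriminator's gradient reach the generator in a differentiable framework.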
Findings
Outperforms likelihood-only models in human evaluations
Uses a Span-BERT discriminator for stable GAN training
Introduces a novel discriminative metric for assessment
Abstract
Autoregressive models based on Transformers have become the prevailing approach for generating music compositions that exhibit comprehensive musical structure. These models are typically trained by minimizing the negative log-likelihood (NLL) of the observed sequence in an autoregressive manner. However, when generating long sequences, the quality of samples from these models tends to significantly deteriorate due to exposure bias. To address this issue, we leverage classifiers trained to differentiate between real and sampled sequences to identify these failures. This observation motivates our exploration of adversarial losses as a complement to the NLL objective. We employ a pre-trained Span-BERT model as the discriminator in the Generative Adversarial Network (GAN) framework, which enhances training stability in our experiments. To optimize discrete sequences within the GAN…
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Model Reduction and Neural Networks
