MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation

Xia Liang; Junmin Wu; Yan Yin

arXiv:1907.01607 · eess.AS · July 5, 2019 · 1 citation

Open Access

TL;DR

MIDI-Sandwich introduces a hierarchical multi-model VAE-GAN framework that generates musically coherent single-track melodies with structure and direction by modeling relationships between musical bars.

Contribution

The paper presents a novel hierarchical multi-model VAE-GAN architecture that incorporates musical knowledge to improve the structure and coherence of generated melodies.

Findings

01. Generates longer, musically structured melodies (17x8 beats) than existing models.

02. Outperforms previous models in producing coherent musical sequences.

03. Validated on the Nottingham dataset, with positive quality assessments.

Abstract

Most existing neural network models for music generation explore how to generate music bars and then directly splice the bars into a song. However, these methods do not explore the relationship between the bars, and the resulting song as a whole has no musical form or sense of musical direction. To address this issue, we propose Multi-model Multi-task Hierarchical Conditional VAE-GAN (Variational Autoencoder-Generative Adversarial Network) networks, named MIDI-Sandwich, which combine musical knowledge such as musical form, tonic, and melodic motion. MIDI-Sandwich has two submodels: the Hierarchical Conditional Variational Autoencoder (HCVAE) and the Hierarchical Conditional Generative Adversarial Network (HCGAN). The HCVAE uses a hierarchical structure. The underlying layer of the HCVAE uses a Local Conditional Variational Autoencoder (L-CVAE) to generate a music bar which is…

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Generative Adversarial Networks and Image Synthesis
