Learning to Fuse Music Genres with Generative Adversarial Dual Learning
Zhiqian Chen, Chih-Wei Wu, Yen-Cheng Lu, Alexander Lerch, and Chang-Tien Lu

TL;DR
FusionGAN introduces a novel music genre fusion framework that combines generative adversarial networks and dual learning, utilizing Wasserstein distance for effective genre merging and style integration.
Contribution
The paper presents a new dual learning extension for GANs that effectively fuses music genres using Wasserstein metrics, addressing domain difference quantification and gradient issues.
Findings
Successfully merges two music genres in experiments
Uses Wasserstein distance to measure domain differences
Effective style integration demonstrated on public datasets
Abstract
FusionGAN is a novel genre fusion framework for music generation that integrates the strengths of generative adversarial networks and dual learning. In particular, the proposed method offers a dual learning extension that can effectively integrate the styles of the given domains. To efficiently quantify the difference among diverse domains and avoid the vanishing gradient issue, FusionGAN provides a Wasserstein-based metric to approximate the distance between the target domain and the existing domains. Adopting the Wasserstein distance, a new domain is created by combining the patterns of the existing domains using adversarial learning. Experimental results on public music datasets demonstrated that our approach could effectively merge two genres.
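To make the idea of measuring a candidate domain's distance to two existing domains concrete, here is a minimal sketch on toy one-dimensional data. It is not the paper's method: FusionGAN approximates the Wasserstein distance adversarially with neural networks, whereas this example uses the closed-form empirical Wasserstein-1 distance for equal-size 1-D samples. The function names `wasserstein_1d` and `fusion_objective`, the imbalance penalty, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    For 1-D samples of equal size, the optimal transport plan matches
    sorted values, so the distance is the mean absolute difference of
    the sorted arrays.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y)))


def fusion_objective(candidate, domain_a, domain_b):
    """Hypothetical fusion score (illustrative, not the paper's loss).

    Sums the distances from the candidate to both existing domains and
    adds a penalty when the candidate sits closer to one than the other.
    """
    d_a = wasserstein_1d(candidate, domain_a)
    d_b = wasserstein_1d(candidate, domain_b)
    return d_a + d_b + abs(d_a - d_b)


# Toy stand-ins for features of two genres (assumed data).
domain_a = np.array([0.0, 1.0, 2.0, 3.0])
domain_b = np.array([10.0, 11.0, 12.0, 13.0])

# A candidate halfway between the two domains, and one that copies domain A.
midpoint = np.array([5.0, 6.0, 7.0, 8.0])
copy_of_a = domain_a.copy()

score_midpoint = fusion_objective(midpoint, domain_a, domain_b)
score_copy = fusion_objective(copy_of_a, domain_a, domain_b)
print(score_midpoint, score_copy)
```

In this toy setting the sum of the two distances alone is the same for the midpoint and for the copy of domain A, so only the imbalance term separates them; the balanced candidate scores lower. How the paper itself balances the two domains is not stated in the abstract.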
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Generative Adversarial Networks and Image Synthesis
