ConchShell: A Generative Adversarial Networks that Turns Pictures into Piano Music

Wanpeng Fan; Yuanzhi Su; Yuxin Huang

arXiv:2210.05076·cs.SD·October 12, 2022·1 cites

TL;DR

ConchShell is a novel multi-modal GAN that generates piano music from images using a new temporal image feature representation, with potential applications in entertainment and virtual environments.

Contribution

The paper introduces ConchShell, a GAN framework that uses a new time-convolutional neural network (TCNN) image feature representation, and releases the BOPD dataset for multimodal image-to-music research.

Findings

01. Successfully generates piano music matching image context
02. Introduces a novel TCNN feature representation for images
03. Provides a new dataset supporting multimodal research

Abstract

We present ConchShell, a multi-modal generative adversarial framework that takes pictures as input to the network and generates piano music samples that match the picture context. Inspired by I3D, we introduce a novel image feature representation method: time-convolutional neural network (TCNN), which is used to forge features for images in the temporal dimension. Although our image data consists of only six categories, our proposed framework will be innovative and commercially meaningful. The project will provide technical ideas for work such as 3D game voice overs, short-video soundtracks, and real-time generation of metaverse background music. We have also released a new dataset, the Beach-Ocean-Piano Dataset (BOPD), which contains more than 3,000 images and more than 1,500 piano pieces. This dataset will support multimodal image-to-music research.
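
The abstract says the TCNN is inspired by I3D but does not describe its layers. As background only, the sketch below illustrates the I3D idea it alludes to: a still image can be repeated along a time axis to form a static clip, and a 2D filter can be "inflated" into a 3D filter by copying it along time and rescaling by the temporal depth, so that the 3D response on the static clip equals the 2D response on the image. All function names here are hypothetical, the code uses plain Python lists rather than the PyTorch code in the authors' repository, and it is not the ConchShell TCNN itself.

```python
# Illustrative sketch of I3D-style filter inflation (an assumption about the
# background idea, not the authors' TCNN implementation).

def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a nested-list image with a nested-list kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def image_to_clip(image, num_frames):
    """Repeat one image along a new leading time axis (a static clip)."""
    return [image for _ in range(num_frames)]


def inflate_kernel(kernel, depth):
    """Copy a 2D kernel `depth` times along time and rescale by 1/depth."""
    return [[[w / depth for w in row] for row in kernel] for _ in range(depth)]


def conv3d_valid(clip, kernel3d):
    """Valid 3D cross-correlation over (time, height, width)."""
    kt = len(kernel3d)
    frames = []
    for t in range(len(clip) - kt + 1):
        # Sum the per-frame 2D responses over the kernel's temporal extent.
        partial = [conv2d_valid(clip[t + d], kernel3d[d]) for d in range(kt)]
        frames.append([
            [sum(p[i][j] for p in partial) for j in range(len(partial[0][0]))]
            for i in range(len(partial[0]))
        ])
    return frames


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0], [0.0, -1.0]]

clip = image_to_clip(image, num_frames=4)
response_2d = conv2d_valid(image, kernel)
response_3d = conv3d_valid(clip, inflate_kernel(kernel, depth=2))
```

Each frame of `response_3d` matches `response_2d`, which is the property that lets 3D filters be initialised from 2D ones. How ConchShell goes beyond a static clip to create temporal variation is not specified in the abstract.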

Code & Models

Repositories

DIO385/ConchShell (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Music Technology and Sound Studies