The artificial synesthete: Image-melody translations with variational autoencoders

Karl Wienand; Wolfgang M. Heckl

arXiv:2112.02953·cs.CV·December 7, 2021

Open Access

TL;DR

This paper introduces a neural network system that translates images into melodies and vice versa, creating an artificial synesthete that interprets visual and musical data through learned correspondences.

Contribution

It presents a novel neural network architecture combining autoencoders and translation networks to generate cross-modal representations between images and melodies.

Findings

01 Generates melodies inspired by images and images from music.

02 Demonstrates learned correspondences between visual and musical concepts.

03 Provides a new perspective on machine perception and interpretation.

Abstract

This project presents a system of neural networks to translate between images and melodies. Autoencoders compress the information in samples to abstract representations. A translation network learns a set of correspondences between musical and visual concepts from repeated joint exposure. The resulting "artificial synesthete" generates simple melodies inspired by images, and images from music. These are novel interpretations (not transposed data), expressing the machine's perception and understanding. Observing the work, one explores the machine's perception and thus, by contrast, one's own.
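
The data flow the abstract describes (encode an image to a latent code, translate that code into the melody latent space, decode a melody) can be sketched as follows. This is a minimal, untrained illustration and not the authors' implementation: the `ToyVAE` class, the single linear layers, the linear translation matrix `w_translate`, and all dimensions are assumptions made for the example. In the paper the translation network is learned from paired image-melody exposure, whereas the weights here are random.

```python
import math
import random


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyVAE:
    """Hypothetical, untrained stand-in for one modality's variational autoencoder."""

    def __init__(self, data_dim, latent_dim, rng):
        self.rng = rng
        self.w_mean = random_matrix(latent_dim, data_dim, rng)
        self.w_logvar = random_matrix(latent_dim, data_dim, rng)
        self.w_dec = random_matrix(data_dim, latent_dim, rng)

    def encode(self, x):
        mean = linear(self.w_mean, x)
        logvar = linear(self.w_logvar, x)
        # Reparameterization: z = mean + sigma * eps, with eps ~ N(0, 1)
        return [
            m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, logvar)
        ]

    def decode(self, z):
        # Sigmoid keeps every output value strictly between 0 and 1
        return [1.0 / (1.0 + math.exp(-a)) for a in linear(self.w_dec, z)]


def image_to_melody(image, image_vae, melody_vae, w_translate):
    """Image -> image latent -> melody latent -> melody."""
    z_image = image_vae.encode(image)
    z_melody = linear(w_translate, z_image)  # assumed linear translation
    return melody_vae.decode(z_melody)


# Assumed sizes: a flattened 8x8 image, a 32-value melody encoding, 8 latent units.
IMAGE_DIM, MELODY_DIM, LATENT_DIM = 64, 32, 8

rng = random.Random(0)
image_vae = ToyVAE(IMAGE_DIM, LATENT_DIM, rng)
melody_vae = ToyVAE(MELODY_DIM, LATENT_DIM, rng)
w_translate = random_matrix(LATENT_DIM, LATENT_DIM, rng)

image = [rng.random() for _ in range(IMAGE_DIM)]
melody = image_to_melody(image, image_vae, melody_vae, w_translate)
```

Because the translation acts on latent codes rather than on raw pixels or notes, the output is a decoded interpretation and not a direct transposition of the input data, which matches the distinction the abstract draws. The reverse direction (melody to image) would use the melody encoder, a second translation, and the image decoder.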

Taxonomy

Topics: Music Technology and Sound Studies