Art2Mus: Bridging Visual Arts and Music through Cross-Modal Generation

Ivan Rinaldi; Nicola Fanelli; Giovanna Castellano; Gennaro Vessio

arXiv:2410.04906·cs.MM·July 31, 2025

TL;DR

Art2Mus introduces a novel AI model that generates music from digitized artworks or text, extending cross-modal creativity beyond simple images and enabling new multimedia artistic applications.

Contribution

Art2Mus extends AudioLDM 2, a text-to-audio architecture, so that it can generate music from complex digitized artworks as well as from text. It is trained on newly curated datasets, built with ImageBind, that pair digitized artworks with music.
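This page does not say how ImageBind was used to build the artwork-music pairs. As a minimal sketch only, assume each artwork is matched to the music clip whose embedding is closest in a shared embedding space, measured by cosine similarity. The function name `pair_by_similarity` is hypothetical, and the embeddings are random stand-ins rather than real ImageBind outputs.

```python
import numpy as np

def pair_by_similarity(art_emb, music_emb):
    """Hypothetical helper: for each artwork embedding, return the index of
    the most similar music embedding and the cosine similarity score."""
    # L2-normalize rows so the dot product equals cosine similarity.
    a = art_emb / np.linalg.norm(art_emb, axis=1, keepdims=True)
    m = music_emb / np.linalg.norm(music_emb, axis=1, keepdims=True)
    sim = a @ m.T                      # shape: (n_artworks, n_clips)
    best = sim.argmax(axis=1)          # best-matching clip per artwork
    return best, sim[np.arange(len(a)), best]

# Toy data (assumption): 5 music clips and 3 artworks in an 8-dim space.
# Each artwork is a slightly perturbed copy of one clip, so the expected
# match is known in advance.
rng = np.random.default_rng(0)
music_emb = rng.normal(size=(5, 8))
art_emb = music_emb[[3, 0, 4]] + 0.01 * rng.normal(size=(3, 8))

best, scores = pair_by_similarity(art_emb, music_emb)
print(best, scores)
```

In a real pipeline the two matrices would come from an image encoder and an audio encoder that share an embedding space; the actual pairing criterion used by the authors may differ from this nearest-neighbor rule.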

Findings

01. Successfully generates music aligned with input artworks
02. Demonstrates potential for multimedia art and interactive installations
03. Shows promising results in cross-modal creative applications

Abstract

Artificial Intelligence and generative models have revolutionized music creation, with many models leveraging textual or visual prompts for guidance. However, existing image-to-music models are limited to simple images, lacking the capability to generate music from complex digitized artworks. To address this gap, we introduce Art2Mus, a novel model designed to create music from digitized artworks or text inputs. Art2Mus extends the AudioLDM 2 architecture, a text-to-audio model, and employs our newly curated datasets, created via ImageBind, which pair digitized artworks with music. Experimental results demonstrate that Art2Mus can generate music that resonates with the input stimuli. These findings suggest promising applications in multimedia art, interactive…

Code & Models

Repositories

justivanr/art2mus_
PyTorch · Official

Taxonomy

Topics: Music Technology and Sound Studies · Human Motion and Animation · Computer Graphics and Visualization Techniques