A Mapping Strategy for Interacting with Latent Audio Synthesis Using Artistic Materials
Shuoyang Zheng, Anna Xambó Sedó, and Nick Bryan-Kinns

TL;DR
This paper introduces a novel mapping strategy that enables artistic control of audio synthesis models through visual sketches, leveraging unsupervised feature learning to connect human input with generative AI's latent spaces.
Contribution
It presents a new method for translating high-dimensional sensor data into control signals for deep generative models, demonstrated through a sketch-to-audio synthesis system.
Findings
Successful control of audio synthesis via visual sketches
Use of unsupervised feature learning for mapping control spaces
Discussion on implications for XAI in artistic contexts
Abstract
This paper presents a mapping strategy for interacting with the latent spaces of generative AI models. Our approach involves using unsupervised feature learning to encode a human control space and mapping it to an audio synthesis model's latent space. To demonstrate how this mapping strategy can turn high-dimensional sensor data into control mechanisms of a deep generative model, we present a proof-of-concept system that uses visual sketches to control an audio synthesis model. We draw on emerging discourses in XAIxArts to discuss how this approach can contribute to XAI in artistic and creative contexts. We also discuss its current limitations and propose future research directions.
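The two-stage idea in the abstract (learn features from the control space without labels, then map those features onto a latent space) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the feature learner, the audio model, or any dimensions. PCA stands in for the unsupervised feature learning step, random arrays stand in for flattened sketches, and the function names `fit_pca` and `sketch_to_latent` and the 4-dimensional latent are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on the rows of X (a stand-in for unsupervised feature learning)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data; rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    # Standard deviation of the training data along each principal direction.
    scale = S[:n_components] / np.sqrt(max(len(X) - 1, 1))
    return mean, components, scale

def sketch_to_latent(sketch, mean, components, scale, latent_std=1.0):
    """Project a sketch onto the learned features and rescale each feature
    to the spread assumed for the (hypothetical) synthesis model's latent space."""
    feats = (sketch - mean) @ components.T
    return latent_std * feats / (scale + 1e-8)

# Synthetic placeholder data: 200 flattened 8x8 "sketches".
rng = np.random.default_rng(0)
sketches = rng.random((200, 64))

mean, components, scale = fit_pca(sketches, n_components=4)
z = sketch_to_latent(sketches[0], mean, components, scale)    # one latent vector
Z = sketch_to_latent(sketches, mean, components, scale)       # whole training set
```

In a real system, `z` would be passed to the decoder of a pretrained audio synthesis model. Rescaling each feature to unit spread is one simple way to keep control values inside the region of latent space a model is likely to have seen during training; it is a design choice of this sketch, not something stated in the abstract.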
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing
