Audio Latent Space Cartography
Nicolas Jonason, Bob L.T. Sturm

TL;DR
This paper explores visualizing audio latent spaces with an audio-to-image generation pipeline, with the aim of making those spaces easier to interpret. Results are shown on the NSynth dataset, and a web demo is available.
Contribution
The paper proposes using an audio-to-image generation pipeline to visualize audio latent spaces, as an aid to interpreting audio representations.
Findings
Visualizations of audio latent spaces are generated with an audio-to-image pipeline
A variety of results are demonstrated on the NSynth dataset
A web demo is available for interactive exploration
Abstract
We explore the generation of visualisations of audio latent spaces using an audio-to-image generation pipeline. We believe this can help with the interpretability of audio latent spaces. We demonstrate a variety of results on the NSynth dataset. A web demo is available.
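The abstract does not spell out how the visualisations are assembled, so the following is only a minimal sketch of one way a latent-space map could be laid out, not the authors' pipeline. It assumes a 2D slice of the latent space spanned by four corner vectors, and it swaps the real audio-to-image step for a hypothetical stand-in (`placeholder_tile`) that colours each tile from the first three latent dimensions. All function names and parameters here are illustrative assumptions.

```python
import numpy as np


def latent_grid(corner_a, corner_b, corner_c, corner_d, steps):
    """Bilinearly interpolate four latent vectors into a steps x steps grid.

    corner_a/b are the top-left/top-right corners, corner_c/d the
    bottom-left/bottom-right corners. Returns shape (steps, steps, dim).
    """
    t = np.linspace(0.0, 1.0, steps)
    u = t[:, None, None]  # position along rows (top -> bottom)
    v = t[None, :, None]  # position along columns (left -> right)
    top = (1 - v) * corner_a + v * corner_b
    bottom = (1 - v) * corner_c + v * corner_d
    return (1 - u) * top + u * bottom


def placeholder_tile(z, size=8):
    """Hypothetical stand-in for the audio-to-image step.

    A real pipeline would turn the latent vector into audio and then into
    an image. Here the first three latent dimensions are simply squashed
    into an RGB colour so the sketch runs on its own.
    """
    rgb = 1.0 / (1.0 + np.exp(-z[:3]))
    return np.broadcast_to(rgb, (size, size, 3)).copy()


def render_map(grid, tile_fn, size=8):
    """Tile one image per latent grid point into a single canvas."""
    rows, cols, _ = grid.shape
    canvas = np.zeros((rows * size, cols * size, 3))
    for i in range(rows):
        for j in range(cols):
            canvas[i * size:(i + 1) * size, j * size:(j + 1) * size] = tile_fn(grid[i, j], size)
    return canvas


# Example: a 5 x 5 map over a 16-dimensional latent space (dimensions are arbitrary).
rng = np.random.default_rng(0)
corners = rng.normal(size=(4, 16))
grid = latent_grid(*corners, steps=5)
canvas = render_map(grid, placeholder_tile)
```

The resulting `canvas` array can be shown with any image viewer. Replacing `placeholder_tile` with a function that wraps an actual audio decoder and image generator would be the step that connects this layout to a real audio-to-image pipeline.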
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
