Upmixing via style transfer: a variational autoencoder for disentangling spatial images and musical content
Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim

TL;DR
This paper introduces a variational autoencoder model that disentangles spatial images from musical content in stereo-to-multichannel upmixing, enabling flexible spatial control and transfer of spatial images across songs.
Contribution
A VAE-based approach that learns spatial image representations invariant to the music content, allowing spatial images to be transferred between songs and source panning to be controlled in multichannel music.
Findings
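The model captures spatial images separately from music content, as shown by objective and subjective evaluations.
The learned latent variables enable transfer of spatial images from one song to another.
The model supports interactive panning, either by transfer or by blind panning based on the generative model.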
Abstract
In the stereo-to-multichannel upmixing problem for music, one of the main tasks is to set the directionality of the instrument sources in the multichannel rendering results. In this paper, we propose a modified variational autoencoder model that learns a latent space to describe the spatial images in multichannel music. We seek to disentangle the spatial images and music content, so the learned latent variables are invariant to the music. At test time, we use the latent variables to control the panning of sources. We propose two upmixing use cases: transferring the spatial images from one song to another and blind panning based on the generative model. We report objective and subjective evaluation results to empirically show that our model captures spatial images separately from music content and achieves transfer-based interactive panning.
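The two use cases in the abstract can be illustrated with a toy sketch. This is not the paper's network: the hand-written functions `encode_spatial` and `decode` are hypothetical stand-ins for a learned encoder and decoder, the "spatial latent" is assumed to be a vector of per-channel log energy shares, and the "content" is assumed to be a crude mono downmix of the stereo input. The sketch only shows how a spatial code taken from one song, or sampled from a prior, might be combined with the content of another.

```python
import numpy as np

def encode_spatial(multichannel):
    """Hypothetical spatial encoder: log of each channel's share of total energy.

    multichannel has shape (channels, samples).
    """
    energy = np.sum(multichannel ** 2, axis=1) + 1e-12  # avoid log(0)
    return np.log(energy / energy.sum())

def decode(z_spatial, content):
    """Hypothetical decoder: map the spatial code to gains and apply them.

    The gains are normalized so their squares sum to 1, which keeps the
    total signal energy equal to that of the mono content.
    """
    share = np.exp(z_spatial - z_spatial.max())  # numerically stable softmax
    share /= share.sum()
    gains = np.sqrt(share)
    return gains[:, None] * content[None, :]

def transfer(reference_multichannel, target_stereo):
    """Use case 1: apply the spatial code of a reference song to a target song."""
    content = target_stereo.mean(axis=0)  # crude mono stand-in for "content"
    return decode(encode_spatial(reference_multichannel), content)

def blind_pan(target_stereo, n_channels, rng):
    """Use case 2: sample a spatial code from a standard normal prior."""
    z = rng.standard_normal(n_channels)
    return decode(z, target_stereo.mean(axis=0))

rng = np.random.default_rng(0)
source = rng.standard_normal(1000)
ref_share = np.array([0.5, 0.3, 0.1, 0.05, 0.05])   # assumed 5-channel layout
reference = np.sqrt(ref_share)[:, None] * source     # reference multichannel mix
target = rng.standard_normal((2, 1000))              # stereo input to upmix

upmixed = transfer(reference, target)
sampled = blind_pan(target, n_channels=5, rng=rng)
```

In this toy setting, `upmixed` carries the target's content with the reference's per-channel energy distribution, while `sampled` uses a randomly drawn distribution. A learned model would replace both stand-in functions and operate on richer representations than broadband gains.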
Taxonomy
Topics: Advanced Vision and Imaging · Music Technology and Sound Studies · Computer Graphics and Visualization Techniques
