Learning source-aware representations of music in a discrete latent space

Jinsung Kim; Yeong-Seok Jeong; Woosung Choi; Jaehwa Chung; Soonyoung Jung

arXiv:2111.13321·eess.AS·November 29, 2021

Open Access

TL;DR

This paper introduces a novel VQ-VAE-based method to learn human-readable, source-aware music representations in a discrete latent space, enabling easier analysis, editing, and generation of musical components like basslines.

Contribution

The paper presents a new approach to encode music into a structured, source-aware discrete latent space using VQ-VAE, facilitating human interpretability and manipulation.

Findings

01

Latent representations are human-readable and source-aware.

02

Able to generate basslines by estimating discrete latent vectors.

03

Demonstrates improved interpretability of music representations.

Abstract

In recent years, neural network based methods have been proposed as a method that can generate representations from music, but they are not human readable and hardly analyzable or editable by a human. To address this issue, we propose a novel method to learn source-aware latent representations of music through Vector-Quantized Variational Auto-Encoder (VQ-VAE). We train our VQ-VAE to encode an input mixture into a tensor of integers in a discrete latent space, and design them to have a decomposed structure which allows humans to manipulate the latent vector in a source-aware manner. This paper also shows that we can generate basslines by estimating latent vectors in a discrete space.
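To make the idea of "a tensor of integers" with a decomposed, source-aware structure more concrete, here is a minimal NumPy sketch of the vector-quantization step and a source-level edit. It is not the authors' implementation. The codebook size, code dimension, number of source slots, the slot index used for bass, and the random stand-ins for encoder outputs are all assumptions made for illustration.

```python
import numpy as np

def quantize(z, codebook):
    """Return the index of the nearest codebook entry for each vector in z.

    z:        (..., D) array of continuous encoder outputs
    codebook: (K, D) array of code vectors
    returns:  integer array of shape z.shape[:-1]
    """
    # Squared Euclidean distance from every vector to every code: (..., K)
    dists = ((z[..., None, :] - codebook) ** 2).sum(axis=-1)
    return dists.argmin(axis=-1)

def lookup(indices, codebook):
    """Map integer codes back to their code vectors (input to a decoder)."""
    return codebook[indices]

rng = np.random.default_rng(0)
K, D = 8, 4   # hypothetical codebook size and code dimension
S, T = 4, 6   # hypothetical layout: 4 source slots, 6 time frames
codebook = rng.normal(size=(K, D))

# Random stand-ins for the encoder outputs of two different mixtures.
z_a = rng.normal(size=(S, T, D))
z_b = rng.normal(size=(S, T, D))

codes_a = quantize(z_a, codebook)  # (S, T) tensor of integers
codes_b = quantize(z_b, codebook)

# Source-aware edit: replace only the assumed bass slot of mixture A
# with the bass codes of mixture B, leaving the other slots untouched.
BASS = 1  # hypothetical slot index
edited = codes_a.copy()
edited[BASS] = codes_b[BASS]

z_q = lookup(edited, codebook)  # (S, T, D), would be passed to a decoder
```

Because each source occupies its own slice of the integer tensor in this sketch, an edit reduces to ordinary array indexing. Under the same assumptions, generating a bassline would amount to predicting the integers in the bass slot and decoding the result.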

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies

Methods: VQ-VAE