Source Separation of Multi-source Raw Music using a Residual Quantized Variational Autoencoder

Leonardo Berti

arXiv:2408.07020·cs.SD·August 14, 2024

TL;DR

This paper introduces a neural audio codec based on residual quantized variational autoencoders for musical source separation, achieving near state-of-the-art results with reduced computational requirements.

Contribution

The paper presents a novel residual quantized variational autoencoder architecture for source separation, trained on multi-track music data, with improved efficiency and competitive performance.

Findings

01. Achieves near state-of-the-art separation results

02. Requires less computational power than comparable models

03. Code is publicly available for reproducibility

Abstract

I developed a neural audio codec model based on the residual quantized variational autoencoder architecture and trained it on the Slakh2100 dataset, a standard multi-track audio dataset for musical source separation. The model separates audio sources and achieves near state-of-the-art (SoTA) results with much less computing power. The code is publicly available at github.com/LeonardoBerti00/Source-Separation-of-Multi-source-Music-using-Residual-Quantizad-Variational-Autoencoder
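
For readers unfamiliar with the term, the residual quantization step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy two-stage codebooks, and the plain-list vectors are all assumptions chosen for illustration. In a real codec the codebooks are learned and the input is an encoder latent rather than a hand-picked vector.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def residual_quantize(vec, codebooks):
    """Quantize vec with a stack of codebooks.

    Each stage picks the entry nearest to the residual left over by the
    previous stages, so later codebooks refine the earlier approximation.
    Returns the chosen indices and the summed approximation.
    """
    residual = list(vec)
    approx = [0.0] * len(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        chosen = codebook[idx]
        indices.append(idx)
        approx = [a + c for a, c in zip(approx, chosen)]
        residual = [r - c for r, c in zip(residual, chosen)]
    return indices, approx


# Hypothetical toy codebooks: a coarse first stage and a finer second stage.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25]],
]
toy_indices, toy_approx = residual_quantize([1.2, 0.8], toy_codebooks)
print(toy_indices, toy_approx)
```

In this toy case the first stage selects the coarse entry and the second stage corrects part of the remaining error, which is the general idea behind stacking quantizers: each additional codebook adds detail at the cost of one more index per latent vector.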

Code & Models

Repositories

leonardoberti00/source-separation-of-multi-source-music-using-residual-quantizad-variational-autoencoder
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies