# A Vocoder Based Method For Singing Voice Extraction

**Authors:** Pritish Chandna, Merlijn Blaauw, Jordi Bonada, Emilia Gomez

arXiv: 1903.07554 · 2020-02-13

## TL;DR

This paper introduces a method that uses a convolutional neural network to estimate vocoder parameters from the spectrogram of a musical mixture. The singing voice is then synthesized from those parameters, so the output contains no interference from the backing instruments.

## Contribution

The paper proposes a neural network architecture that combines skip connections, residual connections, and dilated convolutions to estimate vocoder parameters for singing voice extraction.

## Key findings

- Achieves competitive audio quality metrics
- Outperforms NMF-based separation systems
- Subjective evaluations favor the proposed method

## Abstract

This paper presents a novel method for extracting the vocal track from a musical mixture. The musical mixture consists of a singing voice and a backing track, which may comprise various instruments. We use a convolutional network with skip and residual connections as well as dilated convolutions to estimate vocoder parameters, given the spectrogram of an input mixture. The estimated parameters are then used to synthesize the vocal track, without any interference from the backing track. We evaluate our system through objective metrics pertinent to audio quality and interference from background sources, and via a comparative subjective evaluation. We use open-source source separation systems based on Non-negative Matrix Factorization (NMF) and Deep Learning methods as benchmarks for our system and discuss future applications for this particular algorithm.
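
To make the architectural ingredients named in the abstract concrete, the following is a minimal NumPy sketch of a stack of dilated convolutions with residual and skip connections that maps a mixture spectrogram to a frame-wise parameter vector. It is an illustration under stated assumptions, not the authors' network: the layer sizes, the `tanh` activation, the dilation schedule, the random weights, and every function name (`dilated_conv1d`, `residual_block`, `estimate_vocoder_params`, `receptive_field`) are hypothetical, and the vocoder synthesis step is not shown.

```python
import numpy as np


def dilated_conv1d(x, w, dilation):
    """Zero-padded ('same') dilated 1-D convolution along the time axis.

    x: (time, in_channels); w: (kernel, in_channels, out_channels).
    """
    kernel = w.shape[0]
    assert kernel % 2 == 1, "this sketch assumes an odd kernel size"
    pad = dilation * (kernel - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((x.shape[0], w.shape[2]))
    for k in range(kernel):
        start = k * dilation  # taps are spaced `dilation` frames apart
        out += xp[start:start + x.shape[0]] @ w[k]
    return out


def residual_block(x, w_conv, w_res, w_skip, dilation):
    """One block: dilated conv, then a residual path and a skip path."""
    h = np.tanh(dilated_conv1d(x, w_conv, dilation))  # activation is an assumption
    residual = x + h @ w_res  # residual connection: input added back
    skip = h @ w_skip         # skip connection: routed straight to the output
    return residual, skip


def estimate_vocoder_params(spec, params, dilations):
    """Map a (time, freq_bins) spectrogram to (time, n_params) features."""
    x = spec @ params["w_in"]
    skip_sum = 0.0
    for (w_conv, w_res, w_skip), d in zip(params["blocks"], dilations):
        x, skip = residual_block(x, w_conv, w_res, w_skip, d)
        skip_sum = skip_sum + skip
    return np.maximum(skip_sum, 0.0) @ params["w_out"]


def receptive_field(kernel, dilations):
    """Number of input frames seen by one output frame of the stack."""
    return 1 + sum((kernel - 1) * d for d in dilations)


# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
n_frames, n_bins, channels, n_params = 128, 513, 16, 64
dilations = [1, 2, 4, 8]
params = {
    "w_in": 0.01 * rng.standard_normal((n_bins, channels)),
    "blocks": [
        (
            0.1 * rng.standard_normal((3, channels, channels)),
            0.1 * rng.standard_normal((channels, channels)),
            0.1 * rng.standard_normal((channels, channels)),
        )
        for _ in dilations
    ],
    "w_out": 0.1 * rng.standard_normal((channels, n_params)),
}
spec = np.abs(rng.standard_normal((n_frames, n_bins)))  # stand-in for a mixture spectrogram
out = estimate_vocoder_params(spec, params, dilations)
```

With a kernel of 3 and dilations doubling from 1 to 8, each output frame depends on 31 input frames, which is why dilation is a cheap way to widen temporal context without adding layers or pooling. In a trained system the output frames would be passed to a vocoder to synthesize the vocal track; here the weights are random, so the values themselves carry no meaning.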

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.07554/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.07554/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1903.07554/full.md

---
Source: https://tomesphere.com/paper/1903.07554