Learning the Beauty in Songs: Neural Singing Voice Beautifier

Jinglin Liu; Chengxi Li; Yi Ren; Zhiying Zhu; Zhou Zhao

arXiv:2202.13277 · eess.AS · March 3, 2022

TL;DR

This paper introduces Neural Singing Voice Beautifier (NSVB), a generative model that enhances amateur singing by improving pitch and vocal tone using novel time-warping and latent-mapping techniques, validated on Chinese and English songs.

Contribution

The paper presents the first generative model for singing voice beautifying, incorporating a novel time-warping method and a latent-mapping algorithm, along with a new parallel singing dataset.

Findings

01 Effective in improving vocal tone and intonation

02 Works on both Chinese and English songs

03 Outperforms existing methods in objective and subjective metrics

Abstract

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them are restricted to intonation but ignore the overall aesthetic quality. Hence, we introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as the backbone and learns the latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction, Shape-Aware Dynamic Time Warping (SADTW), which improves the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping…
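To make the time-warping idea concrete, here is a minimal sketch of classic dynamic time warping (DTW) applied to two pitch curves. It is not the paper's SADTW: the shape-aware part is not described in the text above and is not reproduced here. The function names `dtw_align` and `warp_to_template`, the absolute-difference frame cost, and the toy pitch values are all assumptions made for illustration.

```python
import numpy as np


def dtw_align(amateur, template):
    """Classic DTW between two 1-D pitch curves (illustrative, not SADTW).

    Returns the total alignment cost and the warping path as a list of
    (amateur_index, template_index) pairs.
    """
    a = np.asarray(amateur, dtype=float)
    t = np.asarray(template, dtype=float)
    n, m = len(a), len(t)

    # acc[i, j] = minimal cumulative cost of aligning a[:i] with t[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - t[j - 1])  # assumed frame-level cost
            acc[i, j] = d + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])

    # Backtrack from the end of both curves to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return acc[n, m], path


def warp_to_template(amateur, template):
    """Resample the amateur curve to one value per template frame.

    Amateur frames mapped to the same template frame are averaged.
    """
    _, path = dtw_align(amateur, template)
    a = np.asarray(amateur, dtype=float)
    warped = np.zeros(len(template))
    counts = np.zeros(len(template))
    for i, j in path:
        warped[j] += a[i]
        counts[j] += 1
    return warped / counts


# Toy example: the amateur holds some notes longer than the template does.
amateur_pitch = [1.0, 1.0, 2.0, 3.0, 3.0]
template_pitch = [1.0, 2.0, 2.0, 3.0]
cost, path = dtw_align(amateur_pitch, template_pitch)
warped = warp_to_template(amateur_pitch, template_pitch)
```

In practice the curves would more likely be log-F0 or semitone values extracted from audio. Plain DTW compares frames only by value, which makes it sensitive to the pitch offsets typical of amateur singing; that is presumably the weakness the shape-aware variant targets.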

Taxonomy

Topics: Music and Audio Processing · Time Series Analysis and Forecasting