Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System

Yukiya Hono; Kei Hashimoto; Keiichiro Oura; Yoshihiko Nankaku; Keiichi Tokuda

arXiv:2108.02776 · eess.AS · September 28, 2021

TL;DR

Sinsy is a DNN-based singing voice synthesis system that improves pitch accuracy, vibrato naturalness, and timing by integrating advanced modeling techniques and a neural vocoder, outperforming traditional methods in quality.

Contribution

The paper introduces a novel DNN-based SVS system with improved pitch, vibrato, and timing modeling, incorporating PeriodNet and automatic pitch correction for enhanced synthesis quality.

Findings

01. More natural vibrato and timing in synthesized singing voices.

02. Higher mean opinion scores in subjective evaluations.

03. Effective pitch correction even with out-of-tune training data.

Abstract

This paper presents Sinsy, a deep neural network (DNN)-based singing voice synthesis (SVS) system. In recent years, DNNs have been utilized in statistical parametric SVS systems, and DNN-based SVS systems have demonstrated better performance than conventional hidden Markov model-based ones. SVS systems are required to synthesize a singing voice with pitch and timing that strictly follow a given musical score. Additionally, singing expressions that are not described on the musical score, such as vibrato and timing fluctuations, should be reproduced. The proposed system is composed of four modules: a time-lag model, a duration model, an acoustic model, and a vocoder, and singing voices can be synthesized taking these characteristics of singing voices into account. To better model a singing voice, the proposed system incorporates improved approaches to modeling pitch and vibrato and better…
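The requirement that pitch and timing follow the musical score, with timing fluctuations layered on top, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Note` class, the `score_to_f0` helper, the 5 ms frame shift, and the hand-supplied onset offsets are all hypothetical, standing in for what trained time-lag and duration models would predict. It only shows how score notes plus per-note onset shifts could yield a flat, frame-level F0 target.

```python
from dataclasses import dataclass
from typing import List

FRAME_SHIFT_SEC = 0.005  # assumption: 5 ms frame shift, not taken from the paper


@dataclass
class Note:
    midi_pitch: int   # MIDI note number (69 = A4 = 440 Hz)
    start_sec: float  # onset written in the score
    end_sec: float    # offset written in the score


def midi_to_hz(midi_pitch: int) -> float:
    """Equal-temperament conversion from MIDI note number to frequency."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def score_to_f0(notes: List[Note], time_lags_sec: List[float]) -> List[float]:
    """Build a flat, frame-level F0 target from score notes.

    time_lags_sec[i] shifts the onset of note i (negative = starts early).
    A real system would predict these offsets with a trained model; here
    they are hand-supplied, hypothetical values.
    """
    if len(notes) != len(time_lags_sec):
        raise ValueError("need one time-lag per note")
    if not notes:
        return []
    onsets = [n.start_sec + lag for n, lag in zip(notes, time_lags_sec)]
    # Each note lasts until the next shifted onset; the last keeps its score offset.
    boundaries = onsets + [notes[-1].end_sec]
    frames = [round(t / FRAME_SHIFT_SEC) for t in boundaries]
    f0: List[float] = []
    for note, begin, end in zip(notes, frames[:-1], frames[1:]):
        f0.extend([midi_to_hz(note.midi_pitch)] * max(end - begin, 0))
    return f0


# Two half-second notes (A4, then B4); the second is sung 50 ms early.
notes = [Note(69, 0.0, 0.5), Note(71, 0.5, 1.0)]
f0 = score_to_f0(notes, [0.0, -0.05])
print(len(f0), f0[0])
```

The contour here is piecewise constant. Expressions such as vibrato, which the abstract notes are not written in the score, would have to be modeled on top of such a target.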

Code & Models

Repositories

r9y9/nnsvs (PyTorch)
