Mandarin Singing Voice Synthesis Based on Harmonic Plus Noise Model and Singing Expression Analysis

Ju-Chiang Wang; Hung-Yan Gu; Hsin-Min Wang

arXiv:1502.04300 · cs.SD · February 17, 2015 · 2 cites

TL;DR

This paper presents a Mandarin singing voice synthesis system controlled by expression parameters extracted from real singing signals. In perceptual experiments, adding these parameters notably improved naturalness, clearness, and expressiveness.

Contribution

It introduces a semi-automatic analysis method for extracting expression parameters from real singing signals and integrates them into a singing voice synthesis (SVS) system for Mandarin Chinese based on the harmonic plus noise model (HNM).

Findings

01. Improved perceptual naturalness and expressiveness in synthesized singing.

02. Effective extraction of expressive parameters from real singing signals.

03. Successful one-to-one mapping of real singing expression to synthesis controls.

Abstract

The purpose of this study is to investigate how humans interpret musical scores expressively, and then design machines that sing like humans. We consider six factors that have a strong influence on the expression of human singing. The factors are related to the acoustic, phonetic, and musical features of a real singing signal. Given real singing voices recorded following the MIDI scores and lyrics, our analysis module can extract the expression parameters from the real singing signals semi-automatically. The expression parameters are used to control the singing voice synthesis (SVS) system for Mandarin Chinese, which is based on the harmonic plus noise model (HNM). The results of perceptual experiments show that integrating the expression factors into the SVS system yields a notable improvement in perceptual naturalness, clearness, and expressiveness. By one-to-one mapping of the real…
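As a rough illustration of the harmonic plus noise idea named in the abstract, the sketch below builds one short frame as a sum of sinusoids at integer multiples of a fundamental frequency plus a noise term. It is a minimal sketch under stated assumptions, not the authors' system: the function names, the parameter values, and the use of unfiltered white noise are all hypothetical simplifications (HNM implementations typically shape the noise component in frequency and time), and the abstract does not specify any of these details. The MIDI-to-frequency helper uses the standard equal-temperament convention (note 69 = 440 Hz).

```python
import math
import random


def midi_to_hz(note):
    """Convert a MIDI note number to Hz (equal temperament, note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def synthesize_hnm_frame(f0_hz, harmonic_amps, noise_gain,
                         sample_rate=16000, n_samples=160, seed=0):
    """Return one frame as harmonics of f0 plus scaled white noise.

    All names and defaults here are illustrative assumptions, not taken
    from the paper. harmonic_amps[k] is the amplitude of harmonic k + 1.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    nyquist = sample_rate / 2.0
    frame = []
    for n in range(n_samples):
        t = n / sample_rate
        # Harmonic part: cosines at integer multiples of f0, zero phase.
        harmonic = sum(
            amp * math.cos(2.0 * math.pi * (k + 1) * f0_hz * t)
            for k, amp in enumerate(harmonic_amps)
            if (k + 1) * f0_hz < nyquist  # skip harmonics above Nyquist
        )
        # Noise part: plain Gaussian white noise (a simplification).
        noise = noise_gain * rng.gauss(0.0, 1.0)
        frame.append(harmonic + noise)
    return frame


# Usage: a 10 ms frame at A4 with three harmonics and a little noise.
frame = synthesize_hnm_frame(midi_to_hz(69), [1.0, 0.5, 0.25], noise_gain=0.05)
```

In a sketch like this, expression parameters extracted from a real performance would presumably enter as time-varying values of the fundamental frequency, the harmonic amplitudes, and the noise gain; how the paper maps its six factors onto synthesis controls is not stated in the abstract.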

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis