Excitation-based Voice Quality Analysis and Modification

Thomas Drugman; Thierry Dutoit; Baris Bozkurt

arXiv:2001.00582 · cs.SD · January 6, 2020 · 1 citation


Open Access

TL;DR

This paper analyzes excitation features across modal, soft, and loud voice qualities from a single speaker, derives modification rules from the observed differences, and applies them as a post-process to HMM-based speech synthesis, achieving the intended voice quality transformations while maintaining the delivered quality.

Contribution

It introduces a novel analysis of excitation features for voice qualities and develops a transformation system for voice quality modification in speech synthesis.

Findings

01 Significant excitation differences among voice qualities

02 Effective voice quality transformation in synthesis

03 Maintained speech quality after modification

Abstract

This paper investigates the differences occurring in the excitation for different voice qualities. Its goal is twofold. First, a large corpus containing three voice qualities (modal, soft, and loud) uttered by the same speaker is analyzed, and significant differences in characteristics extracted from the excitation are observed. Second, modification rules derived from the analysis are used to build a voice quality transformation system applied as a post-process to HMM-based speech synthesis. The system is shown to effectively achieve the transformations while maintaining the delivered quality.


Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders