Excitation-based Voice Quality Analysis and Modification
Thomas Drugman, Thierry Dutoit, Baris Bozkurt

TL;DR
This paper analyzes excitation features across modal, soft, and loud voice qualities, derives modification rules from the observed differences, and applies them as a post-process to HMM-based speech synthesis, achieving the intended voice quality transformations while maintaining the delivered quality.
Contribution
It introduces a novel analysis of excitation features for voice qualities and develops a transformation system for voice quality modification in speech synthesis.
Findings
Significant excitation differences among voice qualities
Effective voice quality transformation in synthesis
Maintained speech quality after modification
Abstract
This paper investigates the differences occurring in the excitation for different voice qualities. Its goal is two-fold. First, a large corpus containing three voice qualities (modal, soft and loud) uttered by the same speaker is analyzed, and significant differences in characteristics extracted from the excitation are observed. Second, modification rules derived from the analysis are used to build a voice quality transformation system applied as a post-process to HMM-based speech synthesis. The system is shown to effectively achieve the transformations while maintaining the delivered quality.
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
