Data-driven Detection and Analysis of the Patterns of Creaky Voice

Thomas Drugman; John Kane; Christer Gobl

arXiv:2006.00518 · eess.AS · June 2, 2020

TL;DR

This study analyzes the acoustic patterns of creaky voice across languages and speakers, improving automatic detection accuracy and revealing diverse, speaker-dependent creaky patterns with implications for speech technology.

Contribution

It introduces a comprehensive analysis of creaky voice patterns using mutual information and classification, enhancing detection methods and understanding of speaker variability.
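
As an illustration of what a mutual information-based feature assessment can look like, here is a minimal sketch. It is not the authors' implementation: the function name `mutual_information_bits`, the equal-width binning, and the bin count are assumptions made for this example, and the paper's actual estimator and feature set are not shown on this page. The sketch scores one acoustic feature by how many bits of information its quantized value carries about a frame-level creaky/non-creaky label.

```python
import math
from collections import Counter


def mutual_information_bits(feature, labels, n_bins=10):
    """Estimate I(feature; label) in bits from paired samples.

    Hypothetical helper: the continuous feature is quantized into
    equal-width bins, then the plug-in (histogram) estimate is used.
    """
    lo, hi = min(feature), max(feature)
    # Fall back to a width of 1.0 when the feature is constant.
    width = (hi - lo) / n_bins or 1.0
    # Map each value to a bin index in 0..n_bins-1 (max value goes in the last bin).
    bins = [min(int((x - lo) / width), n_bins - 1) for x in feature]

    n = len(bins)
    joint = Counter(zip(bins, labels))  # counts of (bin, label) pairs
    bin_counts = Counter(bins)          # marginal counts of bins
    label_counts = Counter(labels)      # marginal counts of labels

    mi = 0.0
    for (b, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        ratio = count * n / (bin_counts[b] * label_counts[y])
        mi += p_xy * math.log2(ratio)
    return mi


# A feature that perfectly separates two balanced classes carries 1 bit.
separating = mutual_information_bits([0.0] * 4 + [1.0] * 4, [0] * 4 + [1] * 4)
# A feature that is independent of the label carries 0 bits.
independent = mutual_information_bits([0.0, 1.0, 0.0, 1.0], [0, 0, 1, 1])
```

Ranking candidate features by such a score is one simple way to compare their relevance before training a classifier. Histogram estimates are sensitive to the bin count and to small sample sizes, so the values are best read as relative, not absolute.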

Findings

01. Improved creaky voice detection accuracy over previous methods.

02. Identification of multiple distinct creaky voice patterns.

03. Significant speaker-dependent variability in creaky patterns.

Abstract

This paper investigates the temporal excitation patterns of creaky voice. Creaky voice is a voice quality frequently used as a phrase-boundary marker, but also as a means of portraying attitude, affective states and even social status. Consequently, the automatic detection and modelling of creaky voice may have implications for speech technology applications. The acoustic characteristics of creaky voice are, however, rather distinct from those of modal phonation. Further, several acoustic patterns can bring about the perception of creaky voice, thereby complicating the strategies used for its automatic detection, analysis and modelling. The present study covers a variety of languages and speakers, uses both read and conversational data, and involves a mutual information-based assessment of the various acoustic features proposed in the literature for detecting creaky voice. These…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing