Glottal Source Processing: from Analysis to Applications

Thomas Drugman; Paavo Alku; Abeer Alwan; Bayya Yegnanarayana

arXiv:1912.12604·cs.SD·January 1, 2020

Open Access

TL;DR

This paper reviews techniques for analyzing the glottal source in speech, emphasizing its potential to enhance voice technology applications despite the complexity of extracting glottal flow information.

Contribution

It provides a comprehensive overview of methods for glottal source processing and discusses their integration into voice technology applications.

Findings

01

Glottal flow analysis offers valuable complementary information to traditional acoustic features.

02

Various techniques for pitch tracking and glottal flow estimation are discussed.

03

Integration of glottal analysis can improve voice technology performance.

Abstract

The great majority of current voice technology applications rely on acoustic features characterizing the vocal tract response, such as the widely used MFCC or LPC parameters. Nonetheless, the airflow passing through the vocal folds, known as the glottal flow, is expected to carry relevant complementary information. Unfortunately, glottal analysis from speech recordings requires specific and more complex processing operations, which explains why it has generally been avoided. This review gives a general overview of techniques that have been designed for glottal source processing. Starting from the fundamental analysis tools of pitch tracking, glottal closure instant detection, and glottal flow estimation and modelling, this paper then highlights how these solutions can be properly integrated within various voice technology applications.
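
As a rough illustration of the first analysis tool named in the abstract, pitch tracking, the sketch below estimates the fundamental frequency of a single voiced frame with a normalized autocorrelation search. This is a generic textbook approach and is not taken from the paper; the function name `estimate_f0`, its parameters, and the default search range of 60 to 400 Hz are assumptions made for this example.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Return an F0 estimate in Hz for one frame, or None if no estimate.

    Hypothetical helper for illustration only: it picks the lag with the
    highest normalized autocorrelation inside the allowed pitch range.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove any DC offset
    if sum(s * s for s in x) == 0.0:
        return None  # silent frame, nothing to estimate

    # Convert the frequency search range into a lag range (in samples).
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        a = x[: n - lag]  # frame without its last `lag` samples
        b = x[lag:]       # the same frame shifted by `lag` samples
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        if den > 0.0 and num / den > best_score:
            best_lag, best_score = lag, num / den

    return sample_rate / best_lag if best_lag else None
```

Calling `estimate_f0` on a 50 ms frame of a synthetic 100 Hz sine sampled at 8 kHz should give a value close to 100 Hz. Real speech would additionally need a voicing decision and smoothing across frames, which this sketch leaves out.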

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders