Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation

Thomas Drugman; Baris Bozkurt; Thierry Dutoit

arXiv:1912.12843·cs.SD·January 1, 2020

TL;DR

This paper explores using the complex cepstrum for efficient glottal flow estimation. It demonstrates accuracy comparable to the Zeros of the Z-Transform (ZZT) method with faster computation, and points to applications in voice quality analysis.

Contribution

It introduces a novel application of complex cepstrum for glottal source estimation, showing its effectiveness and speed advantages over existing ZZT-based methods.

Findings

01

Complex cepstrum effectively decomposes speech into causal and anticausal components.

02

The method yields glottal estimates similar to those of ZZT, but at much higher computational speed.

03

The method has potential for voice quality analysis on large speech datasets.

Abstract

The complex cepstrum is known in the literature for linearly separating causal and anticausal components. Relying on advances achieved by the Zeros of the Z-Transform (ZZT) technique, we here investigate the possibility of using the complex cepstrum for glottal flow estimation on a large-scale database. Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met. It is also shown that this complex cepstral decomposition gives glottal estimates similar to those obtained with the ZZT method. However, as the complex cepstrum uses FFT operations instead of requiring the factoring of high-degree polynomials, the method benefits from a much higher speed. Finally, in our tests on a large corpus of real expressive speech, we show that the…
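The causal-anticausal separation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it omits the windowing criteria the paper studies, and the function names (`complex_cepstrum`, `causal_anticausal_split`), the FFT size, and the toy test sequence are all assumptions made for illustration. The idea is that the complex cepstrum (inverse FFT of the log spectrum with unwrapped phase) places the minimum-phase, causal contribution at positive quefrencies and the maximum-phase, anticausal contribution at negative quefrencies, so the two can be separated by simple masking.

```python
import numpy as np

def complex_cepstrum(x, n_fft=1024):
    """Complex cepstrum of a real sequence (illustrative sketch)."""
    X = np.fft.fft(x, n_fft)
    # Assumption: fold a negative overall gain into the sign so the
    # phase starts at zero at DC.
    if X[0].real < 0:
        X = -X
    phase = np.unwrap(np.angle(X))
    half = n_fft // 2
    # Remove the linear-phase (pure delay) term, estimated from the
    # unwrapped phase at the Nyquist bin.
    delay = np.round(phase[half] / np.pi)
    phase = phase - np.pi * delay * np.arange(n_fft) / half
    log_X = np.log(np.abs(X)) + 1j * phase
    return np.fft.ifft(log_X).real

def causal_anticausal_split(x, n_fft=1024):
    """Split x into causal and anticausal components via the cepstrum."""
    c = complex_cepstrum(x, n_fft)
    half = n_fft // 2
    # Quefrencies n >= 0 (including the log-gain term c[0]) -> causal part.
    c_causal = np.zeros(n_fft)
    c_causal[:half] = c[:half]
    # Quefrencies n < 0 are stored at the end of the FFT buffer.
    c_anti = np.zeros(n_fft)
    c_anti[half + 1:] = c[half + 1:]
    # Map each cepstral half back to the time domain.
    causal = np.fft.ifft(np.exp(np.fft.fft(c_causal))).real
    anticausal = np.fft.ifft(np.exp(np.fft.fft(c_anti))).real
    return causal, anticausal

# Toy mixed-phase sequence (hypothetical data, not speech):
# one zero inside the unit circle and one zero outside it.
min_phase = np.array([1.0, -0.5])   # zero at z = 0.5
max_phase = np.array([0.5, 1.0])    # zero at z = -2
x = np.convolve(min_phase, max_phase)
causal, anticausal = causal_anticausal_split(x)
```

In this toy case the causal output reproduces the minimum-phase factor, while the anticausal output holds the maximum-phase factor with its sample at a negative time index (stored at the end of the buffer). On real speech, the quality of this separation depends on the windowing, which is the subject of the paper's systematic study.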

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Voice and Speech Disorders