Accurate analysis of the pitch pulse-based magnitude/phase structure of natural vowels and assessment of three lightweight time/frequency voicing restoration methods

Aníbal J. S. Ferreira; Luis M. T. Jesus; Laurentino M. M. Leal; Jorge E. F. Spratley

arXiv:2506.06675·eess.AS·June 10, 2025

Open Access

TL;DR

This paper introduces a method for analyzing the harmonic magnitude/phase structure of natural vowels and evaluates three lightweight, real-time voicing restoration techniques suited to low-resource devices.

Contribution

It proposes a new algorithm for segmenting pitch pulses and compares three model-based synthetic voicing methods for whispered speech restoration.

Findings

1. Differences between sustained and co-articulated vowels identified

2. Three signal reconstruction methods compared both objectively and subjectively

3. Physiologically-inspired filtering shows promising results

Abstract

Whispered speech is produced when the vocal folds are not used, either intentionally, or due to a temporary or permanent voice condition. The essential difference between natural speech and whispered speech is that periodic signal components that exist in certain regions of the former, called voiced regions, as a consequence of the vibration of the vocal folds, are missing in the latter. The restoration of natural speech from whispered speech requires delicate signal processing procedures that are especially useful if they can be implemented on low-resourced portable devices, in real-time, and on-the-fly, taking advantage of the established source-filter paradigm of voice production and related models. This paper addresses two challenges that are intertwined and are key in informing and making viable this envisioned technological realization. The first challenge involves characterizing…

Taxonomy

Topics: Voice and Speech Disorders · Phonetics and Phonology Research · Speech Recognition and Synthesis