Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra

Thomas Drugman; Yannis Stylianou

arXiv:2006.00521 · eess.AS · June 2, 2020

TL;DR

This paper introduces a novel Maximum Voiced Frequency (MVF) estimation method that combines amplitude and phase spectra, improving accuracy, especially for high-pitched voices, and enhancing speech synthesis quality.

Contribution

It presents a new MVF estimation approach utilizing phase information alongside amplitude spectra, outperforming existing methods in speech and singing voice synthesis.

Findings

1. Superior performance in objective evaluations
2. Significant perceptual improvements in high-pitched voices
3. Outperforms state-of-the-art MVF estimation methods

Abstract

Maximum Voiced Frequency (MVF) is used in various speech models as the spectral boundary separating periodic and aperiodic components during the production of voiced sounds. Recent studies have shown that its proper estimation and modeling enhance the quality of statistical parametric speech synthesizers. Contrastingly, these same methods of MVF estimation have been reported to degrade the performance of singing voice synthesizers. This paper proposes a new approach for MVF estimation which exploits both amplitude and phase spectra. It is shown that phase conveys relevant information about the harmonicity of the voice signal, and that it can be jointly used with features derived from the amplitude spectrum. This information is further integrated into a maximum likelihood criterion which provides a decision about the MVF estimate. The proposed technique is compared to two…
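The maximum likelihood decision mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two-dimensional per-harmonic feature vectors (one amplitude-derived value, one phase-derived value), the Gaussian class-conditional models, and the independence assumption between features are all hypothetical choices made for illustration. The sketch scans every candidate boundary between harmonics and keeps the one that maximizes the total log-likelihood of "voiced below, unvoiced above".

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian (hypothetical stand-in for a learned model)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def harmonic_loglik(feature_vec, model):
    """Sum per-feature log-likelihoods, assuming the features are independent."""
    return sum(gaussian_logpdf(x, mean, std) for x, (mean, std) in zip(feature_vec, model))


def estimate_mvf_index(features, voiced_model, unvoiced_model):
    """Return k, the number of leading harmonics treated as voiced.

    score(k) = sum_{i < k} log p(f_i | voiced) + sum_{i >= k} log p(f_i | unvoiced)
    and the k in 0..len(features) with the highest score is returned.
    """
    ll_voiced = [harmonic_loglik(f, voiced_model) for f in features]
    ll_unvoiced = [harmonic_loglik(f, unvoiced_model) for f in features]

    # Start with k = 0: every harmonic is explained by the unvoiced model.
    score = sum(ll_unvoiced)
    best_k, best_score = 0, score
    for k in range(1, len(features) + 1):
        # Moving the boundary up by one harmonic swaps that harmonic's term.
        score += ll_voiced[k - 1] - ll_unvoiced[k - 1]
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Toy data (made up): (amplitude feature, phase feature) for six harmonics.
features = [(0.9, 1.1), (1.0, 0.8), (0.95, 0.9), (0.8, 1.0), (0.1, 0.0), (-0.1, 0.2)]
voiced_model = [(1.0, 0.3), (1.0, 0.3)]    # (mean, std) per feature, hypothetical
unvoiced_model = [(0.0, 0.3), (0.0, 0.3)]  # (mean, std) per feature, hypothetical

f0_hz = 200.0  # assumed fundamental frequency of the frame
k = estimate_mvf_index(features, voiced_model, unvoiced_model)
mvf_hz = k * f0_hz  # boundary placed at the last harmonic judged voiced
```

In this toy setup the boundary is expressed in harmonic indices and converted to hertz by multiplying with the frame's fundamental frequency; a real system would estimate the class-conditional distributions from data and would likely smooth the estimate across frames.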

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis