A Simple Feature Method for Prosody Rhythm Comparison

Mariana Julião; Alberto Abad; Helena Moniz

arXiv:2212.10201·eess.AS·December 21, 2022

TL;DR

This paper introduces an unsupervised, content-independent method called Peak Embedding for assessing prosody rhythm, demonstrating its effectiveness through clustering metrics on speech data.

Contribution

It proposes a novel fixed-length representation for rhythm comparison that simplifies and improves upon traditional, cumbersome measurement techniques.

Findings

01

Achieved 0.444 Silhouette Coefficient with PE and Loudness.

02

Attained 0.979 Global Separability Index with PE, Pitch, and Loudness.

03

Demonstrated effective clustering of speech units based on rhythm features.
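
For readers unfamiliar with the first metric above, the following is a minimal sketch of the standard Silhouette Coefficient, written in plain Python. It is not the authors' evaluation code: the function name `silhouette`, the Euclidean distance, and the toy data are assumptions made only for illustration. Values near 1 indicate compact, well-separated clusters; values near 0 indicate overlapping ones.

```python
import math
from typing import Dict, List, Sequence


def silhouette(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette score over all points (Euclidean distance assumed)."""
    # Group point indices by cluster label.
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores: List[float] = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy one-dimensional data (hypothetical): two tight, distant groups.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
toy_labels = [0, 0, 1, 1]
toy_score = silhouette(toy_points, toy_labels)
```

On this toy data the score is close to 0.9, because each group is tight relative to the gap between the groups.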

Abstract

Of all components of Prosody, Rhythm has been regarded as the hardest to address, as it is tightly linked to Pitch and Intensity. Nevertheless, Rhythm is a very good indicator of a speaker's fluency in a foreign language or even of some diseases. Canonical ways to measure Rhythm, such as $\Delta C$ or $\%V$, involve a cumbersome process of segment alignment, often leading to modest and questionable results. Perceptually, however, rhythm does not seem as difficult, as humans can grasp it even when the text is not fully intelligible. In this work, we develop an empirical and unsupervised method of rhythm assessment, which does not rely on the content. We have created a fixed-length representation of each utterance, Peak Embedding (PE), which codifies the proportional distance between peaks of the chosen Low-Level Descriptors. Clustering pairs of small sentence-like units, we have…
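
The abstract describes Peak Embedding only at a high level, so the following is a rough sketch of one way such a representation could be built, not the authors' implementation. Everything here is an assumption: the naive peak picker, the normalisation by the span between the first and last peak, the zero padding, and the names `find_peaks`, `peak_embedding`, and `size` are all hypothetical.

```python
from typing import List, Sequence


def find_peaks(contour: Sequence[float]) -> List[int]:
    """Indices of strict local maxima (a deliberately naive peak picker)."""
    return [
        i
        for i in range(1, len(contour) - 1)
        if contour[i - 1] < contour[i] > contour[i + 1]
    ]


def peak_embedding(contour: Sequence[float], size: int = 8) -> List[float]:
    """Hypothetical fixed-length vector of proportional inter-peak distances.

    `contour` stands in for one Low-Level Descriptor track (e.g. loudness
    sampled per frame). Each gap between consecutive peaks is divided by
    the distance from the first to the last peak, so the values do not
    depend on the absolute duration of the utterance.
    """
    peaks = find_peaks(contour)
    if len(peaks) < 2:
        return [0.0] * size  # assumption: no rhythm information available
    span = peaks[-1] - peaks[0]
    gaps = [(b - a) / span for a, b in zip(peaks, peaks[1:])]
    gaps = gaps[:size]  # assumption: truncate long utterances
    return gaps + [0.0] * (size - len(gaps))  # assumption: zero-pad short ones


# Two toy contours (hypothetical) with the same relative peak spacing
# but different absolute lengths.
short = [0, 1, 0, 0, 2, 0, 1, 0]
long_ = [0.0] * 14
long_[2], long_[8], long_[12] = 1.0, 2.0, 1.0

pe_short = peak_embedding(short, size=4)
pe_long = peak_embedding(long_, size=4)
```

In this sketch both toy contours map to approximately `[0.6, 0.4, 0.0, 0.0]`, which illustrates the intent of a proportional encoding: two utterances with the same relative peak timing yield the same vector regardless of tempo. Vectors of this kind could then be passed to any standard clustering routine.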

Taxonomy

Topics: Phonetics and Phonology Research · Natural Language Processing Techniques · Speech Recognition and Synthesis