Automatic measurement of vowel duration via structured prediction

Yossi Adi; Joseph Keshet; Emily Cibelli; Erin Gustafson; Cynthia Clopper; Matthew Goldrick

arXiv:1610.08166 · stat.ML · March 8, 2017

TL;DR

This paper introduces a structured prediction machine learning model that automatically measures vowel duration from acoustic signals, outperforming traditional HMM-based forced aligners and reducing the need for manual annotation in phonetic studies.

Contribution

The paper presents a novel structured prediction approach for automatic vowel duration measurement that does not require phonetic transcription, improving accuracy over existing methods.

Findings

01 Model outperforms HMM-based forced aligners in accuracy

02 Requires no phonetic or orthographic transcription

03 Demonstrates scalability for phonetic research

Abstract

A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized set of a vowel's onset and offset are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually-measured vowel durations. The trained model can then automatically…
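The inference step described in the abstract (score every hypothesized onset/offset pair through feature functions, then keep the best pair) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the per-frame energy input, the two toy feature functions in `toy_features`, the hand-set weight vector, and the brute-force search. The paper's actual acoustic features, learned weights, and training procedure are not given on this page and are not reproduced.

```python
from typing import List, Sequence, Tuple


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def toy_features(energy: Sequence[float], onset: int, offset: int) -> List[float]:
    """Hypothetical feature map phi(x, onset, offset).

    Two illustrative features: how much louder the hypothesized vowel
    is than the material before it, and than the material after it.
    """
    inside = _mean(energy[onset:offset])
    before = _mean(energy[:onset])
    after = _mean(energy[offset:])
    return [inside - before, inside - after]


def predict_vowel_boundaries(
    energy: Sequence[float], weights: Sequence[float], min_len: int = 1
) -> Tuple[int, int]:
    """Return the (onset, offset) frame pair maximizing w . phi(x, onset, offset).

    The vowel is assumed to be flanked by consonants, so at least one
    frame is kept before the onset and after the offset. The offset is
    exclusive, so the duration in frames is offset - onset.
    """
    n = len(energy)
    best_pair, best_score = None, float("-inf")
    for onset in range(1, n - min_len):
        for offset in range(onset + min_len, n):
            phi = toy_features(energy, onset, offset)
            score = sum(w * f for w, f in zip(weights, phi))
            if score > best_score:
                best_pair, best_score = (onset, offset), score
    if best_pair is None:
        raise ValueError("signal too short for a consonant-vowel-consonant segment")
    return best_pair


# Toy signal: quiet consonant frames around a loud vowel (frames 3 to 6).
energy = [0.1, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9, 0.2, 0.1, 0.1]
onset, offset = predict_vowel_boundaries(energy, weights=[1.0, 1.0])
duration_frames = offset - onset
```

In a trained system the weights would be learned from manually annotated onsets and offsets rather than set by hand, and the duration in frames would be converted to time using the analysis frame step.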

Code & Models

Repositories

adiyoss/AutoVowelDuration
Official
