A Perceptual Alphabet for the 10-dimensional Phonetic-prosodic Space

Elaine Y L Tsiang

arXiv:1306.2593 · cs.SD · January 23, 2020

TL;DR

This paper introduces the IHA, an alphabet for the 10-dimensional phonetic-prosodic space of speech, whose dimensions are perceptual observables rather than articulatory specifications. It also notes that the IHA has been implemented as the target random variable in a speech recognizer.

Contribution

It defines the IHA, an alphabet of the 10-D phonetic-prosodic space, and models speech as a symbolic sequence in the 4-D phonetic subspace augmented with diacritics of the 6-D prosodic subspace. The definitions supersede an earlier version.

Findings

01

The IHA has been implemented as the target random variable in a speech recognizer.

02

Speech is modeled as a random chain in time in the 4-D phonetic subspace, augmented with diacritics of the 6-D prosodic subspace.

03

The definitions are based on the oral billiards model of speech, which is set out in a separate paper.

Abstract

We define an alphabet, the IHA, of the 10-D phonetic-prosodic space. The dimensions of this space are perceptual observables, rather than articulatory specifications. Speech is defined as a random chain in time of the 4-D phonetic subspace, that is, a symbolic sequence, augmented with diacritics of the remaining 6-D prosodic subspace. The definitions here are based on the model of speech of oral billiards, and supersede an earlier version. This paper only enumerates the IHA in detail as a supplement to the exposition of oral billiards in a separate paper. The IHA has been implemented as the target random variable in a speech recognizer.
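
The structure described in the abstract, a symbolic sequence over a 4-D phonetic subspace with diacritics from a 6-D prosodic subspace, can be pictured with a minimal Python sketch. Only the dimension counts (4, 6, and 10) come from the abstract. The class name `Letter`, the helper `symbolic_sequence`, and the integer coding of each coordinate are hypothetical choices made for illustration; the abstract does not name the individual dimensions or their values, so none are assumed here.

```python
from dataclasses import dataclass
from typing import List, Tuple

PHONETIC_DIMS = 4  # size of the phonetic subspace, as stated in the abstract
PROSODIC_DIMS = 6  # size of the prosodic subspace, as stated in the abstract


@dataclass(frozen=True)
class Letter:
    """One element of the chain: a phonetic symbol plus prosodic diacritics.

    The integer coding of each coordinate is an assumption of this sketch,
    not the paper's actual value set.
    """
    phonetic: Tuple[int, ...]
    prosodic: Tuple[int, ...]

    def __post_init__(self) -> None:
        # Enforce the 4 + 6 split of the 10-D space.
        if len(self.phonetic) != PHONETIC_DIMS:
            raise ValueError(f"expected {PHONETIC_DIMS} phonetic coordinates")
        if len(self.prosodic) != PROSODIC_DIMS:
            raise ValueError(f"expected {PROSODIC_DIMS} prosodic coordinates")

    def as_point(self) -> Tuple[int, ...]:
        """Return the full 10-D point (phonetic followed by prosodic)."""
        return self.phonetic + self.prosodic


def symbolic_sequence(chain: List[Letter]) -> List[Tuple[int, ...]]:
    """Drop the diacritics and keep only the 4-D phonetic symbols, in order."""
    return [letter.phonetic for letter in chain]
```

In this sketch an utterance is simply a list of `Letter` values ordered in time. Calling `symbolic_sequence` on that list recovers the bare phonetic chain, while `as_point` gives the full 10-D coordinates of a single element.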

Taxonomy

Topics: Advanced Data Compression Techniques · Image Retrieval and Classification Techniques · Speech Recognition and Synthesis