Detection of transitions between broad phonetic classes in a speech signal

T V Ananthapadmanabha; K V Vijay Girish; A G Ramakrishnan

arXiv:1411.0370·cs.SD·November 4, 2014

Open Access

TL;DR

This paper presents a hierarchical method for detecting transitions between broad phonetic classes in a speech signal. On the TIMIT database, its accuracy is comparable to or better than that of existing methods.

Contribution

A novel hierarchical approach for detecting broad phonetic class transitions in speech signals with high accuracy and robustness.

Findings

01

93.6% transition detection within 20 ms tolerance

02

83.5% accuracy in class onset detection

03

Performance comparable or superior to state-of-the-art methods

Abstract

Detection of transitions between broad phonetic classes in a speech signal is an important problem which has applications such as landmark detection and segmentation. The proposed hierarchical method detects silence to non-silence transitions, high-amplitude (mostly sonorants) to low-amplitude (mostly fricatives/affricates/stop bursts) transitions and vice versa. A subset of the extremum (minimum or maximum) samples between every pair of successive zero-crossings is selected above a second-pass threshold, from each bandpass-filtered speech signal frame. Relative to the mid-point (reference) of a frame, locations of the first and the last extrema lie on either side, if the speech signal belongs to a homogeneous segment; else, both these locations lie on the left or the right side of the reference, indicating a transition frame. When tested on the entire TIMIT database, of the…
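The frame-level decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `rel_threshold` parameter, and the choice of a single relative threshold (a fraction of the largest extremum in the frame) are assumptions made for illustration, since the abstract does not specify how the second-pass threshold is computed. The sketch also assumes the frame has already been bandpass filtered.

```python
import numpy as np

def extrema_between_zero_crossings(frame):
    """Index of the largest-magnitude sample between each pair of
    successive zero-crossings (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    signs = np.signbit(frame)
    # crossings[k] is the first sample after the k-th sign change.
    crossings = np.flatnonzero(signs[:-1] != signs[1:]) + 1
    idx = [start + int(np.argmax(np.abs(frame[start:stop])))
           for start, stop in zip(crossings[:-1], crossings[1:])]
    return np.array(idx, dtype=int)

def is_transition_frame(frame, rel_threshold=0.5):
    """Return True when the first and last selected extrema lie on the
    same side of the frame mid-point.

    rel_threshold is an assumed stand-in for the paper's second-pass
    threshold: extrema below this fraction of the frame's largest
    extremum are discarded.
    """
    frame = np.asarray(frame, dtype=float)
    idx = extrema_between_zero_crossings(frame)
    if idx.size == 0:
        return False  # no extrema found, so no evidence of a transition
    mags = np.abs(frame[idx])
    keep = idx[mags >= rel_threshold * mags.max()]
    mid = (len(frame) - 1) / 2.0
    first, last = keep[0], keep[-1]
    # Homogeneous segment: first extremum left of mid, last right of mid.
    return not (first < mid < last)
```

With this rule, a steady sinusoid spanning the whole frame is classified as homogeneous, whereas a frame whose first half is much weaker than its second half (or the reverse) is flagged as a transition frame, because every extremum that survives the threshold falls on one side of the mid-point.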

Taxonomy

Topics: Speech and Audio Processing · Phonetics and Phonology Research · Speech Recognition and Synthesis