Intrinsic normalization and extrinsic denormalization of formant data of vowels

T.V. Ananthapadmanabha; A.G. Ramakrishnan

arXiv:1609.05104 · cs.SD · December 13, 2016

Open Access

TL;DR

This paper introduces a combined normalization and denormalization method for vowel formant data that reduces talker variability and improves vowel classification accuracy.

Contribution

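It proposes a speaker-extrinsic re-scaling step applied after a known speaker-intrinsic normalization (scaling by the reciprocal of the geometric mean of the first three formants). The re-scaling uses the mean formant values of vowels and is intended to correct the distortion of the vowel space that the intrinsic step introduces.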

Findings

01

Higher vowel classification accuracy than two top-ranked normalization procedures

02

Reduced talker-induced spread in vowel formant data

03

Effective combination of intrinsic and extrinsic normalization techniques

Abstract

Using a known speaker-intrinsic normalization procedure, formant data are scaled by the reciprocal of the geometric mean of the first three formant frequencies. This reduces the influence of the talker but results in a distorted vowel space. The proposed speaker-extrinsic procedure re-scales the normalized values by the mean formant values of vowels. When tested on the formant data of vowels published by Peterson and Barney, the combined approach leads to well separated clusters by reducing the spread due to talkers. The proposed procedure performs better than two top-ranked normalization procedures based on the accuracy of vowel classification as the objective measure.
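The two steps described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the formant values are illustrative and not taken from the paper, and the abstract does not specify exactly how the mean formant values enter the re-scaling. Here the re-scaling factor is assumed to be a single scalar, the geometric mean of the dataset-wide mean F1, F2 and F3.

```python
import math


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_intrinsic(formants):
    """Speaker-intrinsic step: scale one token's (F1, F2, F3) by the
    reciprocal of the geometric mean of those three formants."""
    gm = geometric_mean(formants)
    return tuple(f / gm for f in formants)


def denormalize_extrinsic(normalized, reference_scale):
    """Speaker-extrinsic step (ASSUMED form): multiply the normalized
    values by a scalar derived from mean formant values of vowels."""
    return tuple(n * reference_scale for n in normalized)


# Illustrative (F1, F2, F3) tokens in Hz; hypothetical data.
tokens = [
    (270.0, 2290.0, 3010.0),
    (310.0, 2790.0, 3310.0),
    (730.0, 1090.0, 2440.0),
    (850.0, 1220.0, 2810.0),
]

# ASSUMPTION: the reference scale is the geometric mean of the mean
# F1, F2 and F3 taken over all tokens in the data set.
mean_formants = [sum(t[i] for t in tokens) / len(tokens) for i in range(3)]
reference_scale = geometric_mean(mean_formants)

rescaled = [
    denormalize_extrinsic(normalize_intrinsic(t), reference_scale)
    for t in tokens
]
```

In this sketch the intrinsic step makes a token invariant to a uniform scaling of its three formants, which is one way to see why it reduces talker influence. The extrinsic step then maps the dimensionless ratios back to a common frequency range.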

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing