In Search of Autocorrelation Based Vocal Cord Cues for Speaker Identification

Md. Sahidullah; Goutam Saha

arXiv:1105.2095·cs.HC·May 24, 2011

TL;DR

This paper explores autocorrelation-based features from the LP residual of speech signals to improve speaker identification accuracy by capturing vocal cord cues, complementing traditional vocal tract features.

Contribution

It introduces a novel autocorrelation-based feature extraction method from LP residuals for speaker identification, demonstrating improved accuracy when fused with traditional features.
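The feature extraction idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the LP order of 12, the lag range, the normalization by residual energy, and the use of the autocorrelation method for LP analysis are all assumptions made for the example, since the summary does not specify them.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Estimate LP coefficients by the autocorrelation method (assumed here)."""
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    if r[0] == 0.0:
        return np.zeros(order)  # silent frame: nothing to predict
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, order):
    """Inverse-filter the frame: e[n] = s[n] - sum_k a[k] * s[n - k]."""
    a = lp_coefficients(frame, order)
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[:len(frame)]

def residual_autocorr_features(frame, order=12, lags=range(1, 13)):
    """Energy-normalized autocorrelation of the LP residual at chosen lags.

    The order and lag values are illustrative defaults, not the paper's.
    """
    lags = list(lags)
    frame = np.asarray(frame, dtype=float)
    e = lp_residual(frame, order)
    energy = np.dot(e, e)
    if energy == 0.0:
        return np.zeros(len(lags))
    return np.array([np.dot(e[:len(e) - k], e[k:]) / energy for k in lags])
```

In a full system each speech frame would typically be windowed before this step, and the resulting vectors would be pooled per speaker for model training; those stages are omitted here.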

Findings

01

Autocorrelation features provide complementary vocal cord information.

02

Fusion of these features with traditional ones enhances speaker identification accuracy.

03

Results on two public databases show improved performance.

Abstract

In this paper we investigate a technique for extracting vocal source based features from the LP residual of the speech signal for automatic speaker identification. These features are derived by computing the autocorrelation of the residual signal at some specific lag. Compared to traditional features such as MFCC and PLPCC, which represent vocal tract information, these features represent complementary vocal cord information. Our experiment in fusing these two sources of information to represent speaker characteristics yields better speaker identification accuracy. We use Gaussian mixture model (GMM) based speaker modeling, and results on two public databases are shown to validate our proposition.
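One common way to combine two information sources in GMM-based speaker identification is score-level fusion, sketched below. The abstract does not say which fusion scheme the authors use, so the weighted sum of per-speaker log-likelihoods, the function names, and the `weight` parameter are assumptions for illustration only. The scores in the usage example are made-up numbers, not results from the paper.

```python
import numpy as np

def fuse_scores(tract_ll, source_ll, weight=0.5):
    """Weighted sum of per-speaker log-likelihoods from two model sets.

    tract_ll:  scores from models trained on vocal tract features (e.g. MFCC).
    source_ll: scores from models trained on residual autocorrelation features.
    weight:    hypothetical fusion weight in [0, 1] given to the tract scores.
    """
    tract_ll = np.asarray(tract_ll, dtype=float)
    source_ll = np.asarray(source_ll, dtype=float)
    if tract_ll.shape != source_ll.shape:
        raise ValueError("both score vectors must cover the same speakers")
    return weight * tract_ll + (1.0 - weight) * source_ll

def identify(tract_ll, source_ll, weight=0.5):
    """Return the index of the speaker with the highest fused score."""
    return int(np.argmax(fuse_scores(tract_ll, source_ll, weight)))

# Toy example with invented scores for three enrolled speakers.
tract = [-41.2, -40.9, -43.0]
source = [-12.1, -13.5, -12.8]
best = identify(tract, source, weight=0.5)
```

With these invented scores the tract models alone would pick speaker 1, while the fused score picks speaker 0, which shows how a complementary stream can change the decision. In practice the weight would be tuned on held-out data.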

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing