On the Use of Different Feature Extraction Methods for Linear and Non Linear kernels

Imen Trabelsi; Dorra Ben Ayed

arXiv:1406.7314·cs.CL·July 1, 2014

TL;DR

This paper compares speech feature extraction methods (LPC, MFCC, and PLP) and analyzes their impact on text-independent speaker identification using GMMs combined with linear and non-linear SVM kernels.

Contribution

It provides a comparative evaluation of multiple feature extraction techniques and normalization methods for speaker identification with different kernel types.

Findings

01. MFCC with RASTA filtering performs best with SVM.

02. PLP features show robustness across normalization methods.

03. Linear kernels outperform non-linear kernels in certain configurations.

Abstract

Speech feature extraction has been a key focus of robust speech recognition research; it significantly affects recognition performance. In this paper, we first study a set of feature extraction methods such as linear predictive coding (LPC), mel-frequency cepstral coefficients (MFCC), and perceptual linear prediction (PLP), together with several feature normalization techniques such as RASTA filtering and cepstral mean subtraction (CMS). Based on this, we perform a comparative evaluation of these features on the task of text-independent speaker identification, using a combination of Gaussian mixture models (GMM) with linear and non-linear kernels based on support vector machines (SVM).
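Of the normalization techniques named in the abstract, cepstral mean subtraction is simple enough to illustrate in a few lines. The sketch below is not taken from the paper: it assumes the cepstral features are stored as a NumPy array of shape (frames, coefficients), and the function name `cepstral_mean_subtraction` and the toy data are hypothetical. It shows the standard idea that a constant per-coefficient offset, such as one introduced by a stationary channel, is removed once the per-utterance mean is subtracted.

```python
import numpy as np


def cepstral_mean_subtraction(features: np.ndarray) -> np.ndarray:
    """Subtract the per-coefficient mean over time.

    `features` is assumed to have shape (frames, coefficients);
    this layout is an assumption of the sketch, not of the paper.
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


# Toy "utterance": 4 frames of 3 cepstral coefficients (random, illustrative).
rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 3))

# A constant offset per coefficient stands in for a stationary channel effect.
offset = np.array([1.5, -0.7, 0.2])

normalized = cepstral_mean_subtraction(clean + offset)
```

After normalization every coefficient has zero mean over the utterance, and the result is the same whether or not the constant offset was present, which is why CMS is commonly used to reduce channel mismatch.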

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing