A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition

Md. Sahidullah; Goutam Saha

arXiv:1206.2437·cs.CV·June 5, 2015

TL;DR

This paper introduces a new windowing method for computing MFCCs that improves speaker recognition accuracy by incorporating spectral slope and phase information, outperforming both the baseline Hamming window and recent multitaper methods.

Contribution

A novel family of window functions, based on the frequency-domain differentiation property of the DTFT, that improves MFCC computation for speaker recognition.

Findings

01

Substantial and consistent performance improvement over the baseline single-taper Hamming window.

02

Outperforms the recently proposed multitaper windowing technique.

03

Shows mathematically that the slope and phase of the power spectrum are inherently incorporated in the resulting cepstrum.
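
The third finding can be made concrete with a short derivation. This is only a sketch under assumed notation (x[n] for the windowed frame, X(ω) for its DTFT, θ(ω) for its phase), not the paper's own derivation, which is not reproduced on this page. It uses the standard DTFT differentiation property and the polar form of X(ω) to show why the squared magnitude of the differentiated spectrum contains both a slope term and a phase-derivative term.

```latex
\begin{align}
x[n] &\;\longleftrightarrow\; X(\omega) = \sum_{n} x[n]\, e^{-j\omega n}, \\
n\,x[n] &\;\longleftrightarrow\; j\,\frac{dX(\omega)}{d\omega}, \\
X(\omega) &= |X(\omega)|\, e^{j\theta(\omega)}, \\
\left|\frac{dX(\omega)}{d\omega}\right|^{2}
  &= \left(\frac{d|X(\omega)|}{d\omega}\right)^{2}
   + |X(\omega)|^{2}\left(\frac{d\theta(\omega)}{d\omega}\right)^{2}.
\end{align}
```

The first term on the right depends on the slope of the magnitude spectrum, and the second on the derivative of the phase, so a spectrum computed from the n-weighted frame carries both kinds of information.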

Abstract

In this paper, we propose a novel family of windowing techniques to compute Mel-frequency cepstral coefficients (MFCCs) for automatic speaker recognition from speech. The proposed method is based on a fundamental property of the discrete-time Fourier transform (DTFT) related to differentiation in the frequency domain. A classical windowing scheme, such as the Hamming window, is modified to obtain derivatives of the DTFT coefficients. It is shown mathematically that the slope and phase of the power spectrum are inherently incorporated in the newly computed cepstrum. Speaker recognition systems based on the proposed family of window functions attain substantial and consistent performance improvements over both the baseline single-taper Hamming window and the recently proposed multitaper windowing technique.
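
The DTFT property the abstract relies on can be checked numerically. The sketch below is an illustration only, under stated assumptions: the helper names `dtft` and `differentiated_window`, the frame length of 64, and the use of a random test frame are all hypothetical and not taken from the paper, and the paper's exact window definition may differ. It multiplies a Hamming window by the sample index n and compares the spectrum of the frame under that window with j times a finite-difference derivative of the ordinary Hamming-windowed spectrum.

```python
import numpy as np


def dtft(x, omega):
    """Evaluate the DTFT of a finite sequence x[n], n = 0..N-1, at the given frequencies."""
    n = np.arange(len(x))
    return np.exp(-1j * np.outer(omega, n)) @ x


def differentiated_window(N, order=1):
    """Hamming window weighted by n**order (hypothetical helper, illustration only)."""
    n = np.arange(N)
    return (n ** order) * np.hamming(N)


rng = np.random.default_rng(0)
N = 64                                  # assumed frame length
frame = rng.standard_normal(N)          # stand-in for one speech frame
omega = np.linspace(0.1, 3.0, 50)       # test frequencies in rad/sample

w = np.hamming(N)

# Spectrum of the frame under the n-weighted Hamming window.
lhs = dtft(frame * differentiated_window(N, 1), omega)

# j * dX/d(omega) of the Hamming-windowed spectrum, by central difference.
h = 1e-5
rhs = 1j * (dtft(frame * w, omega + h) - dtft(frame * w, omega - h)) / (2 * h)

max_err = np.max(np.abs(lhs - rhs))
rel_err = max_err / np.max(np.abs(lhs))
print(f"relative mismatch: {rel_err:.2e}")
```

The two sides agree up to finite-difference error, which suggests that the spectral derivative can be obtained at the cost of one extra windowing step and a single transform rather than by differentiating the spectrum explicitly.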
