Speaker Recognition -- Wavelet Packet Based Multiresolution Feature Extraction Approach

Saurabh Bhardwaj; Smriti Srivastava; Abhishek Bhandari; Krit Gupta; Hitesh Bahl; J.R.P. Gupta

arXiv:2512.18902·cs.SD·December 25, 2025

Open Access

TL;DR

This paper introduces a hybrid wavelet packet and MFCC feature extraction method for speaker recognition, demonstrating improved accuracy and noise robustness on multiple speech datasets.

Contribution

It presents a novel hybrid feature extraction approach combining MFCC and wavelet packet transform for enhanced speaker recognition performance.

Findings

01 Improved speaker identification accuracy.

02 Enhanced noise robustness in speaker verification.

03 Effective performance on multiple speech corpora.

Abstract

This paper proposes a novel wavelet-packet-based feature extraction approach for text-independent speaker recognition. The features are extracted using a combination of Mel Frequency Cepstral Coefficients (MFCC) and the Wavelet Packet Transform (WPT). The hybrid feature technique combines the human-ear simulation offered by MFCC with the multi-resolution property and noise robustness of WPT. To check the validity of the proposed approach for text-independent speaker identification and verification, we used the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), respectively, as classifiers. The proposed paradigm is tested on the VoxForge speech corpus and the CSTR US KED TIMIT database. The paradigm is also evaluated after adding a standard noise signal at different SNR levels to assess noise robustness. Experimental results show that better…
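To make the two signal-processing ideas in the abstract concrete, here is a minimal sketch of (a) a full wavelet packet decomposition of one speech frame into sub-band log-energies and (b) mixing a noise signal into speech at a target SNR. Everything specific in it is an assumption for illustration, not the authors' method: the abstract does not name the wavelet, the decomposition depth, or how the WPT output is combined with MFCC. The sketch uses the Haar wavelet, depth 3, and hypothetical helper names (`wavelet_packet_leaves`, `log_subband_energies`, `add_noise_at_snr`), and it omits the MFCC step and the GMM/HMM classifiers entirely.

```python
import math
import random


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(frame, level):
    """Full wavelet packet tree: split BOTH branches at every level.

    Assumes len(frame) is divisible by 2**level. Returns 2**level leaf
    coefficient lists in natural (tree) order, not frequency order.
    """
    nodes = [list(frame)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def log_subband_energies(frame, level=3, eps=1e-12):
    """Illustrative feature vector: log energy of each wavelet packet leaf."""
    leaves = wavelet_packet_leaves(frame, level)
    return [math.log(sum(c * c for c in leaf) + eps) for leaf in leaves]


def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR in dB, then add it."""
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    gain = math.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(signal, noise)]


if __name__ == "__main__":
    # Synthetic 256-sample frame standing in for real speech (assumption).
    frame = [math.sin(2.0 * math.pi * 5.0 * n / 256.0) for n in range(256)]
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(256)]

    clean_features = log_subband_energies(frame, level=3)
    noisy_features = log_subband_energies(add_noise_at_snr(frame, noise, 10.0), level=3)
    print(len(clean_features), len(noisy_features))
```

The difference from an ordinary discrete wavelet transform is that the detail branch is split as well as the approximation branch, which yields a uniform tiling of 2^level sub-bands. Because the Haar filters used here are orthonormal, the leaf energies sum to the frame energy, so the log-energies behave like a coarse spectral envelope of the frame.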

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Biometric Identification and Security