Neural Network Based Speaker Classification and Verification Systems with Enhanced Features

Zhenhao Ge; Ananth N. Iyer; Srinath Cheluvaraja; Ram Sundaram; Aravind Ganapathiraju

arXiv:1702.02289·cs.SD·March 20, 2017·2 cites

TL;DR

This paper introduces a neural network framework with enhanced features for speaker recognition, achieving high accuracy and low error rates through optimized features, training, and score normalization techniques.

Contribution

It presents a novel neural network-based speaker recognition system with optimized features, training methods, and normalization techniques that improve performance over previous approaches.

Findings

01 Achieved a 100% classification rate on the TIMIT dataset.

02 Less than 6% Equal Error Rate in speaker verification.

03 Enhanced features and normalization significantly improve system performance.
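
The Equal Error Rate in the findings above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how such a figure might be computed from verification scores, assuming higher scores mean "same speaker". The function name `equal_error_rate` and the toy scores are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) for a set of verification scores.

    Assumption: a trial is accepted when score >= threshold.
    This is an illustrative sketch, not the paper's evaluation script.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)

    # Candidate thresholds: every distinct score observed, in ascending order.
    thresholds = np.unique(np.concatenate([genuine, impostor]))

    # False acceptance rate: impostor trials that are wrongly accepted.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # False rejection rate: genuine trials that are wrongly rejected.
    frr = np.array([(genuine < t).mean() for t in thresholds])

    # The EER lies where the two curves cross; take the closest threshold.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


# Toy scores (hypothetical): one genuine trial scores below one impostor trial.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite set of trials the two error curves rarely meet exactly, so the sketch averages them at the nearest threshold; interpolating between adjacent thresholds is a common alternative.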

Abstract

This work presents a novel framework based on a feed-forward neural network for text-independent speaker classification and verification, two related systems of speaker recognition. With optimized features and model training, it achieves a 100% classification rate and less than 6% Equal Error Rate (EER), using merely about 1 second and 5 seconds of data respectively. Features with stricter Voice Activity Detection (VAD) than the regular one used for speech recognition ensure that stronger voiced portions are extracted for speaker recognition, and speaker-level mean and variance normalization helps to eliminate the discrepancy between samples from the same speaker. Both are shown to improve system performance. In building the neural network speaker classifier, the network structure parameters are optimized with grid search and dynamically reduced regularization parameters are used to…
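
The speaker-level mean and variance normalization mentioned in the abstract can be illustrated with a short sketch. It assumes that frame-level feature vectors are pooled per speaker and that each feature dimension is shifted and scaled by that speaker's own statistics. The function name `speaker_mvn`, the `eps` guard, and the toy data are assumptions made for illustration; the abstract does not specify the authors' exact procedure.

```python
import numpy as np


def speaker_mvn(features, speaker_ids, eps=1e-8):
    """Normalize each speaker's frames to zero mean and unit variance.

    features:    (num_frames, num_dims) array of frame-level features.
    speaker_ids: length-num_frames sequence naming the speaker of each frame.
    eps:         small constant (an assumption here) to avoid division by zero.
    """
    features = np.asarray(features, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    normalized = np.empty_like(features)

    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        # Statistics are computed over all frames of this speaker, per dimension.
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)
        normalized[mask] = (features[mask] - mean) / (std + eps)

    return normalized


# Toy example (hypothetical values): two speakers with very different offsets.
feats = [[1.0, 10.0], [3.0, 30.0], [100.0, 0.0], [102.0, 4.0]]
ids = ["a", "a", "b", "b"]
print(speaker_mvn(feats, ids))
```

After normalization both toy speakers occupy the same range, which shows how the step removes per-speaker offsets and scale differences before the features reach the classifier.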

Code & Models

Repositories

VozWallet/vw_idefix

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing