Minimal Feature Analysis for Isolated Digit Recognition for varying encoding rates in noisy environments

Muskan Garg; Naveen Aggarwal

arXiv:2208.13100·cs.CL·August 30, 2022

Open Access

TL;DR

This study analyzes isolated digit speech recognition performance across various encoding rates and noise levels using multiple feature extraction techniques and HMMs to identify optimal conditions for real-time noisy environments.

Contribution

It provides a comparative analysis of feature extraction methods and encoding rates for digit recognition in noisy settings, aiding real-time speech recognition system design.

Findings

01. MFCC and PLP features perform best in noisy environments.

02. Optimal bit rate identified for balancing quality and efficiency.

03. Performance varies significantly with noise type and level.

Abstract

This research work concerns recent developments in speech recognition. It analyzes isolated digit recognition at different bit rates and at different noise levels. The experiments were carried out using Audacity and the HTK toolkit, with a Hidden Markov Model (HMM) as the recognition model. The feature extraction techniques used are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding (LPC), Perceptual Linear Predictive (PLP), mel spectrum (MELSPEC), and filter bank (FBANK). Three types of noise were considered for testing: random noise, fan noise, and random noise in a real-time environment. This was done to identify the environment best suited to real-time applications. Further, five different types of commonly…
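As an illustration of testing at different noise levels, the following minimal sketch mixes a noise recording into a clean signal at a chosen signal-to-noise ratio. The abstract does not state how noise levels were set, so the use of SNR in dB, the synthetic tone standing in for a spoken digit, and the helper names `mix_at_snr` and `measured_snr_db` are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not taken from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]

    p_clean = np.mean(clean ** 2)   # average power of the clean signal
    p_noise = np.mean(noise ** 2)   # average power of the raw noise
    # Scale the noise so that p_clean / (scale^2 * p_noise) = 10^(snr_db / 10).
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise


def measured_snr_db(clean, noisy):
    """Compute the SNR (dB) of a mixture given the clean reference."""
    residual = np.asarray(noisy, dtype=float) - np.asarray(clean, dtype=float)
    return 10 * np.log10(np.mean(np.square(clean)) / np.mean(residual ** 2))


# Synthetic stand-ins: a 440 Hz tone as the "utterance", Gaussian noise as
# the "random noise" condition. Real experiments would load recorded audio.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noise = rng.standard_normal(len(clean))

noisy = mix_at_snr(clean, noise, snr_db=10.0)
```

Calling `measured_snr_db(clean, noisy)` on the result gives a quick check that the mixture sits at the requested level before features are extracted from it.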

Taxonomy

Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Blind Source Separation Techniques