Feature Normalisation for Robust Speech Recognition

D. S. Pavan Kumar

arXiv:1507.04019 · cs.CL · July 16, 2015 · 1 citation

Open Access

TL;DR

This paper investigates various feature normalisation techniques, including subspace projection and a modified SPLICE algorithm, to improve noise robustness in speech recognition systems, demonstrating enhanced performance in noisy environments.

Contribution

It introduces a subspace projection method using NMF for noise-robust feature extraction and a modified SPLICE training process for better noise adaptation.
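As a rough illustration of the subspace idea, the sketch below reconstructs noisy feature vectors from a fixed, precomputed non-negative basis of clean-speech building blocks. It is not the paper's implementation: the function name `project_onto_clean_basis`, the multiplicative update rule (the standard Frobenius-norm NMF update for the activations), the iteration count, and the assumption that the features are non-negative (for example filterbank energies rather than cepstra) are all choices made here for illustration.

```python
import numpy as np


def project_onto_clean_basis(V, W, n_iter=200, eps=1e-12):
    """Approximate non-negative features V (d x n) as W @ H with W held fixed.

    V : noisy feature vectors, one per column (assumed non-negative).
    W : precomputed clean-speech basis, d x k (assumed non-negative).
    Returns the reconstruction W @ H and the activations H (k x n).
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    if (V < 0).any() or (W < 0).any():
        raise ValueError("NMF requires non-negative inputs")

    # Start from strictly positive activations so the updates can move them.
    H = np.ones((W.shape[1], V.shape[1]))

    # These products do not change while W is fixed, so compute them once.
    WtV = W.T @ V
    WtW = W.T @ W

    for _ in range(n_iter):
        # Multiplicative update for H under the squared-error objective;
        # it keeps every entry of H non-negative.
        H *= WtV / (WtW @ H + eps)

    return W @ H, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.random((20, 5))            # hypothetical clean basis
    clean = W @ rng.random((5, 30))    # synthetic "clean" features
    noisy = clean + 0.1 * rng.random((20, 30))
    recon, H = project_onto_clean_basis(noisy, W)
    print(recon.shape, H.shape)
```

Because the basis stays fixed, only the activations are estimated at test time, which keeps the number of free parameters small. How the basis itself is learned from clean speech is outside this sketch.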

Findings

01

Features become more noise-robust with the proposed subspace method.

02

Modified SPLICE improves recognition accuracy across various noise conditions.

03

The combined approach outperforms existing noise-robust techniques.

Abstract

Speech recognition system performance degrades in noisy environments. If the acoustic models are built using features of clean utterances, the features of a noisy test utterance will be acoustically mismatched with the trained model. This gives poor likelihoods and poor recognition accuracy. Model adaptation and feature normalisation are two broad areas that address this problem. While the former often gives better performance, the latter involves estimating fewer parameters, making the system feasible for practical implementations. This research focuses on the efficacy of various subspace, statistical and stereo-based feature normalisation techniques. A subspace-projection-based method has been investigated as a standalone and adjunct technique involving reconstruction of noisy speech features from a precomputed set of clean speech building-blocks. The building…

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing