Speaker recognition using residual signal of linear and nonlinear prediction models

Marcos Faundez-Zanuy; Daniel Rodríguez-Porcheron

arXiv:2203.09231·cs.SD·March 18, 2022

TL;DR

This paper explores how residual signals from linear and nonlinear prediction models can enhance speaker recognition accuracy, showing significant error rate reductions when combining residual energy measures with LPCC coefficients.

Contribution

It introduces a method that combines residual signal analysis with LPCC coefficients, improving speaker recognition performance over classical approaches.

Findings

01

Adding a residual-energy measure from linear prediction to the LPCC measure reduces the error rate by 2.63 percentage points (from 6.31% to 3.68%).

02

Residuals from a nonlinear, neural-network-based predictor give a larger reported improvement of 3.68% over the LPCC-only method.

03

Combining residuals with LPCC coefficients improves recognition accuracy.

Abstract

This paper discusses the usefulness of the residual signal for speaker recognition. It is shown that combining a measure defined over the LPCC coefficients with a measure defined over the energy of the residual signal improves on the classical method, which considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis, the improvement is 2.63 percentage points (the error rate drops from 6.31% to 3.68%); if it is computed through a nonlinear predictive model based on neural networks, the improvement is 3.68%.
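To make the two quantities in the abstract concrete, here is a minimal Python sketch of how a linear-prediction residual, its energy, and LPCC coefficients could be computed for one frame. It is an illustration under stated assumptions, not the authors' implementation: the function names, the autocorrelation-method LPC estimate, the mean-squared-error definition of residual energy, and the synthetic AR(2) test signal are all hypothetical choices. The abstract does not specify how the two measures are combined, so that step is left out.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Predictor coefficients a_1..a_p such that x_hat[n] = sum_k a_k * x[n-k].

    Assumption: autocorrelation method, solving the normal equations directly.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz matrix R[i, j] = r[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[lags], r[1 : order + 1])


def lp_residual(frame, coeffs):
    """Residual e[n] = x[n] - sum_k a_k * x[n-k] (zero initial history)."""
    inverse_filter = np.concatenate(([1.0], -coeffs))
    return np.convolve(frame, inverse_filter)[: len(frame)]


def lpcc(coeffs, n_ceps):
    """LPC-derived cepstral coefficients via the standard recursion
    c_m = a_m + sum_{k=1}^{m-1} (k/m) * c_k * a_{m-k}, for m <= p."""
    if n_ceps > len(coeffs):
        raise ValueError("this sketch only covers n_ceps <= LPC order")
    c = np.zeros(n_ceps)
    for m in range(1, n_ceps + 1):
        acc = coeffs[m - 1]
        for k in range(1, m):
            acc += (k / m) * c[k - 1] * coeffs[m - k - 1]
        c[m - 1] = acc
    return c


def synth_ar2(n, seed=0):
    """Hypothetical test signal: a stable AR(2) process driven by white noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = 1.3 * x[t - 1] - 0.6 * x[t - 2] + noise[t]
    return x


if __name__ == "__main__":
    frame = synth_ar2(4000)
    a = lpc_coefficients(frame, order=2)
    e = lp_residual(frame, a)
    print("LPC coefficients:", a)
    print("residual energy (mean square):", np.mean(e ** 2))
    print("LPCC:", lpcc(a, n_ceps=2))
```

In a speaker recognition setting, the LPCC vector and the residual energy would each feed their own distance or score; how those scores are weighted and fused is a design decision that the abstract leaves to the full paper.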

Taxonomy

Topics: Speech Recognition and Synthesis · Face and Expression Recognition · Speech and Audio Processing