Speaker recognition using residual signal of linear and nonlinear prediction models
Marcos Faundez-Zanuy, Daniel Rodríguez-Porcheron

TL;DR
This paper explores how residual signals from linear and nonlinear prediction models can enhance speaker recognition accuracy, showing significant error rate reductions when combining residual energy measures with LPCC coefficients.
Contribution
It introduces a method that combines residual signal analysis with LPCC coefficients, improving speaker recognition performance over classical approaches.
Findings
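Adding a linear prediction residual measure reduces the error rate by 2.63 percentage points (from 6.31% to 3.68%).
Adding a residual measure from a nonlinear predictive neural network model reduces the error rate by 3.68 percentage points.
In both cases, combining the residual energy measure with the LPCC measure outperforms using LPCC coefficients alone.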
Abstract
This paper discusses the usefulness of the residual signal for speaker recognition. It is shown that combining a measure defined over the LPCC coefficients with a measure defined over the energy of the residual signal improves on the classical method, which considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis, the improvement is 2.63 percentage points (the error rate drops from 6.31% to 3.68%), and if it is computed through a nonlinear predictive model based on neural nets, the improvement is 3.68 percentage points.
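To make the linear case concrete, here is a minimal sketch of how a linear prediction residual and its energy might be computed for one frame. It is not the authors' implementation: the autocorrelation method, the prediction order, the synthetic test signal, and the helper names (`lpc_coefficients`, `lp_residual`, `combined_score`) are all assumptions chosen for illustration. In particular, the abstract does not say how the two measures are combined, so the weighted sum in `combined_score` is purely hypothetical.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC: solve the normal equations R a = r."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz matrix built from lags 0..order-1
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k], for n >= order."""
    order = len(a)
    pred = np.zeros(len(frame))
    for k in range(order):
        pred[order:] += a[k] * frame[order - 1 - k : len(frame) - 1 - k]
    return frame[order:] - pred[order:]


def combined_score(d_lpcc, d_residual, weight=0.5):
    """Hypothetical fusion rule (assumption): weighted sum of the two measures."""
    return (1.0 - weight) * d_lpcc + weight * d_residual


# Synthetic stand-in for a speech frame: a stable AR(2) process (assumption).
rng = np.random.default_rng(0)
x = np.zeros(2000)
noise = rng.standard_normal(2000)
for i in range(2, len(x)):
    x[i] = 1.3 * x[i - 1] - 0.4 * x[i - 2] + noise[i]

order = 2
a = lpc_coefficients(x, order)
e = lp_residual(x, a)
residual_energy = float(np.sum(e ** 2))
signal_energy = float(np.sum(x ** 2))
```

In a recognition setting, a residual energy measure of this kind would be computed per frame and compared across speaker models, then fused with the LPCC-based distance. The nonlinear variant described in the abstract would replace the linear predictor with a neural network predictor while keeping the same residual definition (signal minus prediction).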
Taxonomy
Topics: Speech Recognition and Synthesis · Face and Expression Recognition · Speech and Audio Processing
