Speaker recognition improvement using blind inversion of distortions
Marcos Faundez-Zanuy, Jordi Sole-Casals

TL;DR
This paper introduces a method to invert nonlinear distortions in speech signals to enhance speaker recognition accuracy, especially under saturation conditions, by combining data fusion techniques with distortion compensation.
Contribution
It presents a novel approach to invert nonlinear distortions in speech signals, improving recognition rates under saturation mismatch conditions.
Findings
Recognition rate on saturated test sentences improved from 80% to 88.57%.
Recognition rate with clean (unsaturated) speech was 87.76% for one microphone.
Combining distortion compensation with data fusion enhances performance.
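As a rough illustration of the last finding, the sketch below fuses two per-speaker score lists, one from a recognizer fed the saturated signal and one from a recognizer fed the compensated signal. The min-max normalisation, the weighted-sum rule, the `weight` parameter, and all function names are assumptions made for this example; the summary does not state which fusion rule the authors used.

```python
def min_max_normalise(scores):
    """Rescale scores to [0, 1] so the two recognizers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(scores_plain, scores_compensated, weight=0.5):
    """Weighted-sum fusion of two score lists (higher score = better match).

    `weight` is a hypothetical mixing factor, not a value from the paper.
    """
    a = min_max_normalise(scores_plain)
    b = min_max_normalise(scores_compensated)
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


def identify(scores_plain, scores_compensated, weight=0.5):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse_scores(scores_plain, scores_compensated, weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

Calling `identify([0.2, 0.9, 0.5], [0.1, 0.7, 0.8])` selects speaker index 1, while setting `weight=0` relies only on the compensated scores and selects index 2.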
Abstract
In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognition system. We study the effect of saturation on the test signals, to account for real situations in which the training material was recorded under controlled conditions but the test signals present some mismatch in input signal level (saturation). The experimental results show that a combination of data fusion with and without nonlinear distortion compensation can improve the recognition rate on saturated test sentences from 80% to 88.57%, while the result with clean speech (without saturation) is 87.76% for one microphone.
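To make the saturation mismatch concrete, here is a minimal sketch that simulates it on a list of test samples. Hard clipping at a fixed `level` is an assumption chosen for simplicity; the abstract does not specify the saturation model, and the function names are hypothetical.

```python
def saturate(samples, level):
    """Hard-clip each sample to the range [-level, level].

    This is an assumed saturation model, not necessarily the paper's.
    """
    return [max(-level, min(level, s)) for s in samples]


def clipped_fraction(samples, level):
    """Fraction of samples whose magnitude reaches or exceeds `level`."""
    return sum(1 for s in samples if abs(s) >= level) / len(samples)
```

For example, `saturate([0.1, 0.6, -0.7, 0.2], 0.5)` leaves the two small samples untouched and limits the other two to ±0.5, and `clipped_fraction` on the same input reports that half of the samples are affected.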
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Speech Recognition and Synthesis
