Recursive Whitening Transformation for Speaker Recognition on Language Mismatched Condition
Suwon Shon, Seongkyu Mun, Hanseok Ko

TL;DR
This paper introduces a recursive whitening transformation approach to improve speaker recognition accuracy under language mismatched conditions, addressing a less-explored challenge in the field.
Contribution
The paper proposes a novel recursive whitening transformation method specifically designed to mitigate language mismatches in speaker recognition systems.
Findings
Effective in reducing language mismatch effects in speaker recognition
Outperforms baseline systems on non-English speaker recognition tasks
Validated on Speaker Recognition Evaluation 2016 trials
Abstract
Recently in speaker recognition, performance degradation due to the channel domain mismatched condition has been actively addressed. However, the mismatch arising from language has yet to be sufficiently addressed. This paper proposes an approach which employs a recursive whitening transformation to mitigate the language mismatched condition. The proposed method is based on multiple whitening transformations, which are intended to remove un-whitened residual components in the dataset associated with i-vector length normalization. The experiments were conducted on the Speaker Recognition Evaluation 2016 trials, whose task is non-English speaker recognition using a development dataset consisting of both a large-scale out-of-domain (English) dataset and an extremely low-quantity in-domain (non-English) dataset. For performance comparison, we develop a state-of-the-art system using deep…
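The idea of repeating whitening and length normalization can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which data each whitening transform is estimated on, how many stages are used, or how the stages are ordered. The sketch assumes each stage fits a PCA-style whitening transform on the current vectors, applies it, and then length-normalizes. The names `fit_whitener`, `length_normalize`, `recursive_whitening`, and `n_stages` are hypothetical.

```python
import numpy as np


def fit_whitener(x, eps=1e-10):
    """Estimate a mean and a PCA whitening matrix from the rows of x."""
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt of its eigenvalue.
    w = eigvecs / np.sqrt(np.maximum(eigvals, eps))
    return mean, w


def length_normalize(x):
    """Project each row of x onto the unit sphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def recursive_whitening(x, n_stages=2):
    """Assumed scheme: whiten, length-normalize, and repeat n_stages times.

    Returns the transformed vectors and the per-stage (mean, matrix) pairs
    so the same transforms could be applied to other vectors later.
    """
    stages = []
    for _ in range(n_stages):
        mean, w = fit_whitener(x)
        x = length_normalize((x - mean) @ w)
        stages.append((mean, w))
    return x, stages
```

Length normalization changes the covariance of the vectors, so they are generally no longer exactly white after a single stage; refitting the whitener on the normalized vectors is one way to target that residual, which is the motivation the abstract describes.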
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
