Recursive Whitening Transformation for Speaker Recognition on Language Mismatched Condition

Suwon Shon; Seongkyu Mun; Hanseok Ko

arXiv:1708.01232·cs.SD·August 29, 2017·2 cites

Open Access

TL;DR

This paper introduces a recursive whitening transformation approach to improve speaker recognition accuracy under language mismatched conditions, addressing a less-explored challenge in the field.

Contribution

The paper proposes a novel recursive whitening transformation method specifically designed to mitigate language mismatches in speaker recognition systems.

Findings

01

Effective in reducing language mismatch effects in speaker recognition

02

Outperforms baseline systems on non-English speaker recognition tasks

03

Validated on Speaker Recognition Evaluation 2016 trials

Abstract

Recently in speaker recognition, performance degradation due to channel domain mismatch has been actively addressed. However, mismatches arising from language have yet to be sufficiently addressed. This paper proposes an approach that employs a recursive whitening transformation to mitigate the language mismatched condition. The proposed method is based on multiple whitening transformations, which are intended to remove un-whitened residual components in the dataset associated with i-vector length normalization. The experiments were conducted on the Speaker Recognition Evaluation 2016 trials, whose task is non-English speaker recognition using a development dataset consisting of both a large-scale out-of-domain (English) dataset and an extremely low-quantity in-domain (non-English) dataset. For performance comparison, we develop a state-of-the-art system using deep…
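
The idea of applying whitening more than once around i-vector length normalization can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of ZCA-style whitening, the fixed iteration count `n_iter`, and the use of a single reference set to estimate the statistics at every pass are all assumptions made for illustration. The abstract does not specify which dataset supplies the statistics at each step.

```python
import numpy as np


def whitening_params(x, eps=1e-8):
    """Estimate a mean and a symmetric (ZCA-style) whitening matrix from rows of x."""
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    # Inverse square root of the covariance; eps guards against tiny eigenvalues.
    w = eigvec @ np.diag(1.0 / np.sqrt(np.maximum(eigval, eps))) @ eigvec.T
    return mean, w


def length_normalize(x):
    """Scale every row vector to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def recursive_whitening(ref, target, n_iter=3):
    """Repeat whitening followed by length normalization n_iter times.

    ref    : hypothetical reference i-vectors used to estimate the statistics
    target : i-vectors to transform with the same statistics
    After each length normalization the reference set is generally no longer
    white, so the statistics are re-estimated on the next pass.
    """
    for _ in range(n_iter):
        mean, w = whitening_params(ref)
        ref = length_normalize((ref - mean) @ w)
        target = length_normalize((target - mean) @ w)
    return ref, target
```

A call such as `recursive_whitening(ref_ivectors, test_ivectors, n_iter=2)` returns both sets after two passes. How many passes help, and which data should drive each pass, are questions the paper's experiments address and this sketch does not.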

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing