An Investigation of Universal Background Sparse Coding Based Speaker Verification on TIMIT
Xiao-Lei Zhang

TL;DR
This paper introduces a universal background sparse coding (UBSC) model for speaker verification, which uses ensemble clustering and sparse coding to avoid Gaussian assumptions and local minima, showing comparable performance to GMM on TIMIT.
Contribution
The paper presents UBSC, a novel universal background model for speaker verification that employs ensemble clustering and sparse coding without Gaussian assumptions.
Findings
UBSC performs comparably to GMM on TIMIT.
UBSC avoids local minima and Gaussian assumptions.
Experiments on TIMIT, scored with cosine similarity and inner product similarity, support UBSC's effectiveness.
Abstract
In this paper, we propose a universal background model, named universal background sparse coding (UBSC), for speaker verification. The proposed method trains an ensemble of clusterings by data resampling, and produces sparse codes from the clusterings by one-nearest-neighbor optimization plus binarization. The main advantage of UBSC is that it does not suffer from local minima and does not make Gaussian assumptions on data distributions. We evaluated UBSC on TIMIT, a clean speech corpus. We scored each trial with cosine similarity and inner product similarity. Experimental results show that UBSC performs comparably to a Gaussian mixture model.
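The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation, and several details are assumptions the abstract does not specify: each base clustering is formed by randomly sampling training frames as centers, distances are Euclidean, frame-level codes are averaged into one utterance-level vector, and all function names and parameter values (`train_ubsc`, `encode`, `n_clusterings`, `n_centers`) are hypothetical.

```python
import numpy as np

def train_ubsc(frames, n_clusterings=10, n_centers=8, seed=0):
    """Build an ensemble of clusterings by data resampling.

    Assumption: each clustering's centers are a random sample of
    training frames, so no iterative optimization is involved.
    """
    rng = np.random.default_rng(seed)
    return [frames[rng.choice(len(frames), size=n_centers, replace=False)]
            for _ in range(n_clusterings)]

def encode(frames, clusterings):
    """Return binary sparse codes, one row per frame.

    For each clustering, the nearest center (one-nearest-neighbor) is
    set to 1 and all others to 0; the per-clustering codes are then
    concatenated.
    """
    codes = []
    for centers in clusterings:
        # Squared Euclidean distances, shape (n_frames, n_centers).
        d = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        onehot = np.zeros_like(d)
        onehot[np.arange(len(frames)), d.argmin(axis=1)] = 1.0
        codes.append(onehot)
    return np.hstack(codes)

def utterance_vector(frames, clusterings):
    """Assumption: average the frame-level codes over the utterance."""
    return encode(frames, clusterings).mean(axis=0)

def cosine_score(a, b):
    """Cosine similarity between two utterance vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def inner_product_score(a, b):
    """Inner product similarity between two utterance vectors."""
    return float(a @ b)
```

In this sketch a trial would be scored by passing the enrollment and test utterance vectors to `cosine_score` or `inner_product_score` and comparing the result against a decision threshold.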
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
