Shared latent subspace modelling within Gaussian-Binary Restricted Boltzmann Machines for NIST i-Vector Challenge 2014
Danila Doroshin, Alexander Yamshinin, Nikolay Lubimov, Marina Nastasenko, Mikhail Kotov, Maxim Tkachenko

TL;DR
This paper introduces a speaker subspace model based on Gaussian-Binary Restricted Boltzmann Machines (GRBM) in which a speaker factor is shared across all of a speaker's vectors, and reports speaker verification results on the NIST i-Vector Challenge 2014 dataset.
Contribution
The paper proposes a new GRBM-based model with shared speaker factors, introduces maximum likelihood parameter estimation for that model, and proposes scoring methods for speaker verification.
Findings
Effective speaker verification on NIST i-vector dataset
Shared latent subspace improves modeling accuracy
New scoring techniques enhance verification performance
Abstract
This paper presents a novel approach to speaker subspace modelling based on Gaussian-Binary Restricted Boltzmann Machines (GRBM). The proposed model builds on the idea of shared factors, as in Probabilistic Linear Discriminant Analysis (PLDA). The GRBM hidden layer is divided into speaker and channel factors, where the speaker factor is shared across all vectors of a speaker. Maximum Likelihood Parameter Estimation (MLE) for the proposed model is then introduced. Several new scoring techniques for speaker verification using the GRBM are proposed. Results on the NIST i-Vector Challenge 2014 dataset are presented.
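To make the idea of a shared speaker factor concrete, here is a minimal sketch of how hidden-unit posteriors could be computed when the hidden layer is split into speaker and channel parts. It is not the authors' formulation: it assumes a standard GRBM parameterization with a single visible noise scale `sigma`, it assumes the speaker hidden bias is counted once per speaker, and all names (`W_s`, `W_c`, `c_s`, `c_c`, `speaker_posterior`, `channel_posterior`) are hypothetical. Under those assumptions, the shared speaker units are driven by the sum of all of a speaker's vectors, while the channel units are inferred separately for each vector.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def speaker_posterior(V, W_s, c_s, sigma=1.0):
    """P(h_s = 1 | all vectors of one speaker), shape (K_s,).

    V   : (N, D) array holding the N vectors of a single speaker
    W_s : (D, K_s) speaker weights (hypothetical name)
    c_s : (K_s,) speaker hidden bias, counted once per speaker (assumption)
    """
    pooled = V.sum(axis=0)  # shared units see the sum over all N vectors
    return sigmoid(c_s + pooled @ W_s / sigma**2)


def channel_posterior(V, W_c, c_c, sigma=1.0):
    """P(h_c = 1 | v_n) for each vector independently, shape (N, K_c).

    W_c : (D, K_c) channel weights (hypothetical name)
    c_c : (K_c,) channel hidden bias
    """
    return sigmoid(c_c + V @ W_c / sigma**2)


# Usage on random data: 5 vectors of dimension 4, 3 speaker and 2 channel units.
rng = np.random.default_rng(0)
V = rng.normal(size=(5, 4))
W_s, c_s = rng.normal(size=(4, 3)), np.zeros(3)
W_c, c_c = rng.normal(size=(4, 2)), np.zeros(2)
print(speaker_posterior(V, W_s, c_s).shape)  # (3,)
print(channel_posterior(V, W_c, c_c).shape)  # (5, 2)
```

In this sketch, the evidence for the speaker units accumulates as more vectors from the same speaker are added, whereas each channel posterior depends only on its own vector. This mirrors the split between the speaker and channel factors in PLDA.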
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
