Graph-based Label Propagation for Semi-Supervised Speaker Identification

Long Chen; Venkatesh Ravichandran; Andreas Stolcke

arXiv:2106.08207·cs.SD·February 22, 2022


TL;DR

This paper introduces a graph-based semi-supervised learning method for speaker identification in the household scenario. It builds a graph over each household's utterances and propagates speaker labels from the few enrolled utterances to the unlabeled ones, so that unlabeled data contributes to identification accuracy.

Contribution

Unlike most speaker recognition work, which focuses on speaker-discriminative embeddings, this work focuses on speaker label inference (scoring). Given a pre-trained embedding extractor, the graph-based approach integrates information from both labeled and unlabeled utterances.

Findings

01. Improved speaker identification accuracy over state-of-the-art methods.

02. Effective use of unlabeled data through graph-based label propagation.

03. Demonstrated benefits on the VoxCeleb dataset.

Abstract

Speaker identification in the household scenario (e.g., for smart speakers) is typically based on only a few enrollment utterances but a much larger set of unlabeled data, suggesting semi-supervised learning to improve speaker profiles. We propose a graph-based semi-supervised learning approach for speaker identification in the household scenario, to leverage the unlabeled speech samples. In contrast to most work in speaker recognition, which focuses on speaker-discriminative embeddings, this work focuses on speaker label inference (scoring). Given a pre-trained embedding extractor, graph-based learning allows us to integrate information about both labeled and unlabeled utterances. Considering each utterance as a graph node, we represent pairwise utterance similarity scores as edge weights. Graphs are constructed per household, and speaker identities are propagated to unlabeled nodes…
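The graph construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes cosine similarity as the pairwise score, row-normalized edge weights, and a standard iterative label propagation with clamped enrollment labels. The function name `propagate_labels` and its parameters are hypothetical.

```python
import numpy as np


def propagate_labels(embeddings, labels, n_speakers, n_iter=50):
    """Propagate speaker labels over one household's utterance graph.

    embeddings: (n_utterances, dim) array from a pre-trained extractor.
    labels: int array; speaker index for enrollment utterances, -1 if unlabeled.
    Returns (predicted speaker per utterance, per-speaker score matrix).
    """
    # Assumption: cosine similarity between utterances serves as the edge weight.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    weights = np.clip(unit @ unit.T, 0.0, None)  # discard negative similarities
    np.fill_diagonal(weights, 0.0)               # no self-loops

    # Row-normalize so each node averages over its neighbors' scores.
    transition = weights / (weights.sum(axis=1, keepdims=True) + 1e-12)

    labeled = labels >= 0
    onehot = np.zeros((len(labels), n_speakers))
    onehot[labeled, labels[labeled]] = 1.0

    scores = onehot.copy()
    for _ in range(n_iter):
        scores = transition @ scores        # spread scores along edges
        scores[labeled] = onehot[labeled]   # clamp enrollment utterances
    return scores.argmax(axis=1), scores
```

In this sketch one graph would be built per household, with `labels` holding the enrolled speakers' indices for the few enrollment utterances and `-1` everywhere else. Clamping keeps the enrollment labels fixed while the unlabeled nodes settle toward the speaker whose utterances they most resemble.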
