Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

Fuchuan Tong; Siqi Zheng; Min Zhang; Yafeng Chen; Hongbin Suo; Qingyang Hong; Lin Li

arXiv:2204.11501 · eess.AS · April 26, 2022

TL;DR

This paper introduces a GCN-based semi-supervised learning method that leverages unlabeled multi-party meeting data to improve speaker recognition accuracy through iterative clustering and self-correction.

Contribution

It proposes a novel GCN-based semi-supervised approach with a self-correcting mechanism for clustering unlabeled speaker data in meetings.

Findings

01. Improved speaker recognition accuracy using unlabeled data

02. Effective use of pseudo-labels in GCN training

03. Self-correcting mechanism enhances clustering performance

Abstract

Unsupervised clustering on speakers is becoming increasingly important for its potential uses in semi-supervised learning. In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions. An effective unsupervised clustering approach would allow us to significantly increase the amount of training data without additional costs for annotations. Recently, methods based on graph convolutional networks (GCN) have received growing attention for unsupervised clustering, as these methods exploit the connectivity patterns between nodes to improve learning performance. In this work, we present a GCN-based approach for semi-supervised learning. Given a pre-trained embedding extractor, a graph convolutional network is trained on the labeled data and clusters unlabeled data with "pseudo-labels". We present a self-correcting training mechanism that…
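
The pipeline the abstract outlines (embeddings, then a graph, then graph convolution, then pseudo-labels) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The truncated abstract does not specify how the graph is built, how many layers the GCN has, how it is trained, or how the self-correcting mechanism works, so the cosine kNN graph, the single untrained layer, and the connected-component labelling below are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def knn_adjacency(emb, k):
    """Symmetric kNN graph over cosine similarity, with self-loops (assumed design)."""
    x = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = x @ x.T
    n = sim.shape[0]
    adj = np.eye(n)
    # Exclude each node from its own neighbour list before ranking.
    np.fill_diagonal(sim, -np.inf)
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg = adj.sum(axis=1)  # at least 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)

def pseudo_labels(adj):
    """One pseudo-label per connected component (a stand-in for the clustering step)."""
    n = adj.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adj[node]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels
```

In a semi-supervised setup of the kind the abstract describes, the layer weights would be learned on the labeled data, and the resulting pseudo-labels would then extend the training set for the embedding extractor. Here `weight` is simply passed in by the caller.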

Taxonomy

Topics: Text and Document Classification Technologies · Speech Recognition and Synthesis · Topic Modeling