Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data
Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li

TL;DR
This paper introduces a GCN-based semi-supervised learning method that leverages unlabeled multi-party meeting data to improve speaker recognition accuracy through iterative clustering and self-correction.
Contribution
It proposes a novel GCN-based semi-supervised approach with a self-correcting mechanism for clustering unlabeled speaker data in meetings.
Findings
Improved speaker recognition accuracy using unlabeled data
Effective use of pseudo-labels in GCN training
Self-correcting mechanism enhances clustering performance
Abstract
Unsupervised clustering on speakers is becoming increasingly important for its potential uses in semi-supervised learning. In reality, we are often presented with enormous amounts of unlabeled data from multi-party meetings and discussions. An effective unsupervised clustering approach would allow us to significantly increase the amount of training data without additional costs for annotations. Recently, methods based on graph convolutional networks (GCN) have received growing attention for unsupervised clustering, as these methods exploit the connectivity patterns between nodes to improve learning performance. In this work, we present a GCN-based approach for semi-supervised learning. Given a pre-trained embedding extractor, a graph convolutional network is trained on the labeled data and clusters unlabeled data with "pseudo-labels". We present a self-correcting training mechanism that…
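The pipeline the abstract outlines (speaker embeddings, a graph over them, graph convolution, then cluster ids used as pseudo-labels) can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the embeddings are synthetic rather than produced by a pre-trained extractor, a single untrained graph-convolution step with identity weights stands in for a trained GCN, and pseudo-labels come from connected components over thresholded cosine similarity. The function names, `k`, and `threshold` are hypothetical, and the self-correcting mechanism is not modeled.

```python
import numpy as np


def knn_adjacency(emb, k):
    """Symmetric k-nearest-neighbour graph built from cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)           # exclude self-matches
    nbrs = np.argsort(-sim, axis=1)[:, :k]   # k most similar nodes per row
    adj = np.zeros(sim.shape)
    adj[np.repeat(np.arange(len(emb)), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)            # make the graph undirected


def gcn_step(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(len(adj))           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)


def pseudo_label(emb, k=3, threshold=0.5):
    """Assign cluster ids (pseudo-labels) to unlabeled embeddings."""
    adj = knn_adjacency(emb, k)
    # Assumption: identity weights replace the weights of a trained GCN.
    h = gcn_step(adj, emb, np.eye(emb.shape[1]))
    unit = h / np.linalg.norm(h, axis=1, keepdims=True)
    # Keep only graph edges whose smoothed features still agree.
    keep = (adj > 0) & (unit @ unit.T > threshold)

    labels = -np.ones(len(emb), dtype=int)
    current = 0
    for start in range(len(emb)):            # connected components by DFS
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(keep[node]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels


# Synthetic stand-in for embeddings of two speakers, six segments each.
rng = np.random.default_rng(0)
centers = np.eye(4)[:2]
emb = np.vstack([c + 0.05 * rng.standard_normal((6, 4)) for c in centers])
labels = pseudo_label(emb)
print(labels)
```

In a semi-supervised setup of the kind the abstract describes, labels like these would be attached to the unlabeled segments and used as extra training targets; how the paper refines them is not shown in the truncated abstract.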
Taxonomy
Topics: Text and Document Classification Technologies · Speech Recognition and Synthesis · Topic Modeling
