Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization

Prachi Singh; Sriram Ganapathy

arXiv:2104.09456·eess.AS·June 15, 2021

TL;DR

This paper introduces an iterative self-supervised clustering method for speaker diarization that alternates between deep representation learning and path integral clustering (PIC), and reports diarization error rate (DER) improvements of 13% on CALLHOME and 59% on AMI.

Contribution

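The paper presents an iterative self-supervised clustering algorithm in which the representation learning step is trained on cluster targets from PIC and the clustering step runs on the embeddings that step produces. The reported results improve on recent diarization approaches.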

Findings

01  13% relative DER improvement over the baseline on CALLHOME

02  59% relative DER improvement over the baseline on AMI

03  Reported DER improves over several recent diarization approaches
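
Reading these percentages as relative reductions in diarization error rate (DER) with respect to a baseline system, the quantity can be written as below. The symbols DER_base and DER_new are notation introduced here for illustration and are not taken from the paper.

```latex
\text{relative improvement (\%)} \;=\;
  100 \times \frac{\mathrm{DER}_{\text{base}} - \mathrm{DER}_{\text{new}}}{\mathrm{DER}_{\text{base}}}
```

Under this reading, a 59% improvement means the new DER is 41% of the baseline DER, not that DER dropped by 59 absolute percentage points.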

Abstract

Automatic speaker diarization techniques typically involve a two-stage processing approach where audio segments of fixed duration are converted to vector representations in the first stage. This is followed by an unsupervised clustering of the representations in the second stage. In most of the prior approaches, these two stages are performed in an isolated manner with independent optimization steps. In this paper, we propose a representation learning and clustering algorithm that can be iteratively performed for improved speaker diarization. The representation learning is based on principles of self-supervised learning while the clustering algorithm is a graph structural method based on path integral clustering (PIC). The representation learning step uses the cluster targets from PIC and the clustering step is performed on embeddings learned from the self-supervised deep model. This…
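
The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the clustering step below uses plain average-linkage agglomerative merging on cosine similarity as a stand-in for PIC, and the representation step simply pulls each embedding toward its cluster mean as a stand-in for retraining a deep model on cluster targets. All function names and parameters (`cluster_step`, `representation_step`, `iterate`, `step`, `num_rounds`) are hypothetical.

```python
import numpy as np


def cosine_similarity(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def cluster_step(x, num_clusters):
    """Stand-in for PIC (assumption): average-linkage agglomerative merging."""
    sim = cosine_similarity(x)
    clusters = [[i] for i in range(len(x))]
    while len(clusters) > num_clusters:
        best_score, best_pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean similarity between all cross-cluster pairs.
                score = sim[np.ix_(clusters[a], clusters[b])].mean()
                if score > best_score:
                    best_score, best_pair = score, (a, b)
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(x), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


def representation_step(x, labels, step=0.5):
    """Stand-in for model retraining (assumption): move each embedding
    a fraction `step` of the way toward its cluster mean."""
    out = x.copy()
    for k in np.unique(labels):
        mask = labels == k
        out[mask] += step * (x[mask].mean(axis=0) - x[mask])
    return out


def iterate(x, num_clusters, num_rounds=3):
    """Alternate between the representation step and the clustering step."""
    x = np.asarray(x, dtype=float)
    labels = cluster_step(x, num_clusters)
    for _ in range(num_rounds):
        x = representation_step(x, labels)   # uses cluster targets
        labels = cluster_step(x, num_clusters)  # uses refined embeddings
    return x, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic "speakers" as well-separated blobs of segment embeddings.
    segments = np.vstack([
        rng.normal([5.0, 0.0], 0.3, size=(10, 2)),
        rng.normal([0.0, 5.0], 0.3, size=(10, 2)),
    ])
    refined, labels = iterate(segments, num_clusters=2)
    print(labels)
```

The point of the sketch is the data flow only: each round feeds cluster labels into the representation update and feeds the updated embeddings back into clustering, which is the coupling the abstract contrasts with isolated two-stage pipelines.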

Code & Models

Repositories

iiscleap/SSC (PyTorch, official)
