Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire

Zhiyun Fan, Linhao Dong, Meng Cai, Zejun Ma, and Bo Xu

arXiv:2206.13110·cs.SD·June 28, 2022

TL;DR

This paper introduces a sequence transduction framework for speaker change detection, built on a novel difference-based continuous integrate-and-fire mechanism, that outperforms frame-level baselines on the AMI and DIHARD-I corpora.

Contribution

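It presents a new encoder-decoder framework that converts the input feature sequence directly to the speaker identity sequence. A difference-based continuous integrate-and-fire mechanism supports the framework, which is supervised by the speaker identity sequence, a weaker label than precise speaker change points.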

Findings

01

Outperforms frame-level baseline methods.

02

Effective on AMI and DIHARD-I corpora.

03

Supports sequence-level supervision for speaker change detection.

Abstract

Speaker change detection is an important task in multi-party interactions such as meetings and conversations. In this paper, we address the speaker change detection task from the perspective of sequence transduction. Specifically, we propose a novel encoder-decoder framework that directly converts the input feature sequence to the speaker identity sequence. The difference-based continuous integrate-and-fire mechanism is designed to support this framework. It detects speaker changes by integrating the speaker difference between the encoder outputs frame-by-frame and transfers encoder outputs to segment-level speaker embeddings according to the detected speaker changes. The whole framework is supervised by the speaker identity sequence, a weaker label than the precise speaker change points. The experiments on the AMI and DIHARD-I corpora show that our sequence-level method consistently…
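The integrate-and-fire idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `difference_based_fire`, the use of cosine distance as the "speaker difference", the running mean as the segment embedding, and the threshold of 1.0 are all assumptions chosen for illustration. In the paper the difference is computed on learned encoder outputs, whereas here the frames are plain vectors.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return 1.0 - float(np.dot(a, b) / denom)


def difference_based_fire(frames, threshold=1.0):
    """Hypothetical sketch of a difference-based integrate-and-fire loop.

    frames: array of shape (T, D), standing in for encoder outputs.
    Returns (boundaries, embeddings): the frame indices where a speaker
    change fires, and one mean embedding per detected segment.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or len(frames) == 0:
        raise ValueError("frames must be a non-empty (T, D) array")

    boundaries = []
    embeddings = []
    start = 0                       # first frame of the current segment
    acc = 0.0                       # integrated speaker difference
    running_sum = frames[0].copy()  # sum of frames in the current segment

    for t in range(1, len(frames)):
        segment_mean = running_sum / (t - start)
        # Integrate the difference between this frame and the segment so far.
        acc += cosine_distance(frames[t], segment_mean)
        if acc >= threshold:
            # Fire: close the segment and start a new one at frame t.
            boundaries.append(t)
            embeddings.append(segment_mean)
            start = t
            acc = 0.0
            running_sum = frames[t].copy()
        else:
            running_sum += frames[t]

    # Emit the embedding of the final (still open) segment.
    embeddings.append(running_sum / (len(frames) - start))
    return boundaries, np.stack(embeddings)
```

Called on five frames pointing along one axis followed by five along an orthogonal axis, the sketch fires once, at the first frame of the second group, and returns two segment embeddings. Real encoder outputs are noisier, so small differences would accumulate over many frames before a fire.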

Code & Models

Repositories

zhiyunfan/seq-scd (Official)
