End-to-End Diarization utilizing Attractor Deep Clustering

David Palzer; Matthew Maciejewski; Eric Fosler-Lussier

arXiv:2506.11090 · cs.SD · June 16, 2025



Open Access

TL;DR

This paper introduces a novel end-to-end speaker diarization framework that combines conformer decoders, transformer-updated attractors, and deep clustering techniques to improve speaker separation and robustness in varied conditions.

Contribution

It presents a compact, integrated diarization approach that refines speaker representations with an enhanced conformer structure and enforces structured embeddings through a deep clustering style angle loss and orthogonality constraints on active attractors.

Findings

1. Achieves low diarization error rates in experiments.

2. Maintains a parameter-efficient model.

3. Improves speaker separation robustness.

Abstract

Speaker diarization remains challenging due to the need for structured speaker representations, efficient modeling, and robustness to varying conditions. We propose a performant, compact diarization framework that integrates conformer decoders, transformer-updated attractors, and a deep clustering style angle loss. Our approach refines speaker representations with an enhanced conformer structure, incorporating cross-attention to attractors and an additional convolution module. To enforce structured embeddings, we extend deep clustering by constructing label-attractor vectors, aligning their directional structure with audio embeddings. We also impose orthogonality constraints on active attractors for better speaker separation while suppressing non-active attractors to prevent false activations. Finally, a permutation invariant training binary cross-entropy loss refines speaker detection.…
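
Two of the loss terms named in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: the function names (`bce`, `pit_bce`, `orthogonality_penalty`), the brute-force search over permutations, and the use of squared cosine similarity as the orthogonality measure are all assumptions made for illustration, since the abstract does not give the exact formulations. It uses only the Python standard library and toy inputs.

```python
import itertools
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equal-length lists in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def pit_bce(probs, labels):
    """Permutation-invariant BCE (hypothetical helper).

    probs, labels: one activity track per speaker, each a list over frames.
    Tries every assignment of label tracks to output tracks and keeps the
    cheapest one. best_perm[i] is the label track matched to output track i.
    """
    n = len(probs)
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        loss = sum(bce(probs[i], labels[perm[i]]) for i in range(n)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


def orthogonality_penalty(attractors, active):
    """Mean squared cosine similarity over pairs of active attractors.

    Assumed form of the constraint: zero when all active attractors are
    mutually orthogonal. Non-active attractors are ignored here.
    """
    vecs = [a for a, on in zip(attractors, active) if on]
    total, pairs = 0.0, 0
    for u, v in itertools.combinations(vecs, 2):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        total += (dot / max(norm, 1e-12)) ** 2
        pairs += 1
    return total / pairs if pairs else 0.0


# Toy example: two output tracks whose order is swapped relative to the labels.
probs = [[0.9, 0.8, 0.1], [0.2, 0.1, 0.9]]
labels = [[0, 0, 1], [1, 1, 0]]
loss, perm = pit_bce(probs, labels)

# Two orthogonal active attractors and one non-active attractor.
penalty = orthogonality_penalty([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                                [True, True, False])
```

In the toy example the permutation search matches output track 0 to label track 1 and vice versa, which is the point of permutation invariant training: the model is not penalized for the order in which it emits speakers. Enumerating permutations is only practical for a handful of speakers; an assignment solver would be the usual substitute for larger counts.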


Taxonomy

Topics: Advanced Computational Techniques and Applications · Neural Networks and Applications