Graph Attention Networks for Speaker Verification

Jee-weon Jung; Hee-Soo Heo; Ha-Jin Yu; Joon Son Chung

arXiv:2010.11543 · eess.AS · February 9, 2021

TL;DR

This paper introduces a back-end framework for speaker verification, based on graph attention networks, that treats segment-wise speaker embeddings as graph nodes and achieves an average 20% reduction in equal error rate over a cosine similarity baseline.

Contribution

The paper interprets segment-wise speaker embeddings from an enrollment and a test utterance as nodes of a graph and applies graph attention networks to that graph to output a similarity score directly.

Findings

01. Achieved an average 20% reduction in equal error rate (EER) over a cosine similarity baseline.

02. Validated the framework's effectiveness across three different speaker embedding extractors.

03. Demonstrated consistent performance improvements with the proposed framework.

Abstract

This work presents a novel back-end framework for speaker verification using graph attention networks. Segment-wise speaker embeddings extracted from multiple crops within an utterance are interpreted as node representations of a graph. The proposed framework takes segment-wise speaker embeddings from an enrollment and a test utterance as input and directly outputs a similarity score. We first construct a graph from the segment-wise speaker embeddings and then feed it to graph attention networks. After a few graph attention layers with residual connections, each node is projected into a one-dimensional space using an affine transform, followed by a readout operation that yields a scalar similarity score. To enable successful adaptation for speaker verification, we propose techniques such as separating trainable weights for attention map calculations between segment-wise speaker embeddings from…
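The pipeline described in the abstract (build a graph from segment embeddings, apply graph attention layers with residual connections, project each node to one dimension, then read out a scalar) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the fully connected graph, the element-wise-product attention, the `tanh` nonlinearity, the mean readout, the random weights, and the names `init_params`, `gat_layer`, and `similarity_score` are all hypothetical choices. It also uses one shared attention weight per layer and does not attempt the weight-separation technique that the truncated abstract begins to describe.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, dim, num_layers):
    """Random (untrained) weights; a real system would learn these."""
    layers = [
        (rng.standard_normal(dim) / np.sqrt(dim),          # attention weight (assumed form)
         rng.standard_normal((dim, dim)) / np.sqrt(dim))   # node projection
        for _ in range(num_layers)
    ]
    return {
        "layers": layers,
        "w_out": rng.standard_normal(dim) / np.sqrt(dim),  # affine map to 1-D
        "b_out": 0.0,
    }


def gat_layer(nodes, w_att, w_proj):
    """One single-head attention layer over a fully connected graph.

    nodes: (N, D) node features; w_att: (D,); w_proj: (D, D).
    """
    # Attention logit for each node pair: weighted element-wise product (assumption).
    pair = nodes[:, None, :] * nodes[None, :, :]   # (N, N, D)
    att = softmax(pair @ w_att, axis=-1)           # (N, N), each row sums to 1
    aggregated = att @ nodes                       # attention-weighted neighbours
    # Residual connection around the transformed aggregate.
    return nodes + np.tanh(aggregated @ w_proj)


def similarity_score(enroll, test, params):
    """Score an enrollment/test pair from their segment-wise embeddings."""
    # Every segment embedding from both utterances becomes a graph node.
    nodes = np.concatenate([enroll, test], axis=0)
    for w_att, w_proj in params["layers"]:
        nodes = gat_layer(nodes, w_att, w_proj)
    # Affine projection of each node to a scalar, then mean readout (assumption).
    node_scores = nodes @ params["w_out"] + params["b_out"]
    return float(node_scores.mean())


rng = np.random.default_rng(0)
params = init_params(rng, dim=8, num_layers=2)
enroll_segments = rng.standard_normal((5, 8))  # 5 crops, 8-dim embeddings (toy sizes)
test_segments = rng.standard_normal((5, 8))
print(similarity_score(enroll_segments, test_segments, params))
```

Because the graph is fully connected and the readout is a mean, the score in this sketch does not depend on the order in which segments are listed, which is one reason a graph view suits a set of crops.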
