Double Multi-Head Attention for Speaker Verification

Miquel India; Pooyan Safari; Javier Hernando

arXiv:2007.13199·eess.AS·January 12, 2021

TL;DR

This paper introduces Double Multi-Head Attention pooling for speaker verification. A second self-attention layer weights the context vectors produced by each attention head and merges them into a single speaker embedding, which lowers EER on VoxCeleb2 relative to both Self Attention pooling and Self Multi-Head Attention pooling.

Contribution

The paper proposes Double Multi-Head Attention pooling, which extends the authors' earlier Self Multi-Head Attention pooling with an additional self-attention layer that summarizes the per-head context vectors into one speaker representation.

Findings

1. Achieved 6.09% relative EER reduction over Self Attention pooling.
2. Achieved 5.23% relative EER reduction over Self Multi-Head Attention.
3. Demonstrated improved feature selection for CNN-based front-ends.
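
For readers unfamiliar with the metric, the snippet below sketches how a relative EER reduction is usually computed. The function name and the input values are hypothetical and chosen only for illustration; they are not EER figures from the paper.

```python
def relative_eer_reduction(baseline_eer, new_eer):
    """Return the relative reduction in EER as a percentage of the baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - new_eer) / baseline_eer * 100.0

# Hypothetical values, NOT taken from the paper: a baseline at 4.0% EER
# and a new system at 3.0% EER.
example = relative_eer_reduction(4.0, 3.0)
print(example)  # 25.0
```

A relative reduction is measured against the baseline's own error rate, so a 6.09% relative reduction is a much smaller change than 6.09 absolute percentage points.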

Abstract

Most state-of-the-art Deep Learning systems for speaker verification are based on speaker embedding extractors. These architectures are commonly composed of a feature extractor front-end together with a pooling layer to encode variable-length utterances into fixed-length speaker vectors. In this paper we present Double Multi-Head Attention pooling, which extends our previous approach based on Self Multi-Head Attention. An additional self attention layer is added to the pooling layer that summarizes the context vectors produced by Multi-Head Attention into a unique speaker representation. This method enhances the pooling mechanism by giving weights to the information captured for each head and it results in creating more discriminative speaker embeddings. We have evaluated our approach with the VoxCeleb2 dataset. Our results show 6.09% and 5.23% relative improvement in terms of EER…
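
The pooling idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes scaled dot-product scoring against learned query vectors, and the names `double_mha_pooling`, `head_queries`, and `summary_query` are hypothetical. Random arrays stand in for trained parameters and front-end features.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def double_mha_pooling(frames, head_queries, summary_query):
    """Pool (T, D) frame features into one fixed-length (D // K,) vector.

    frames:        (T, D) front-end outputs for one utterance
    head_queries:  (K, D // K) one assumed learned query per head
    summary_query: (D // K,) assumed learned query for the second layer
    """
    T, D = frames.shape
    K, d = head_queries.shape
    assert K * d == D, "D must split evenly across heads"

    # Split every frame into K sub-vectors, one per head: (T, K, d).
    heads = frames.reshape(T, K, d)

    # First attention layer: each head weights the frames over time.
    scores = np.einsum("tkd,kd->tk", heads, head_queries) / np.sqrt(d)
    frame_weights = softmax(scores, axis=0)                    # (T, K)
    contexts = np.einsum("tk,tkd->kd", frame_weights, heads)   # (K, d)

    # Second attention layer: weight the K context vectors and sum them.
    head_scores = contexts @ summary_query / np.sqrt(d)        # (K,)
    head_weights = softmax(head_scores, axis=0)
    embedding = head_weights @ contexts                        # (d,)
    return embedding, frame_weights, head_weights

rng = np.random.default_rng(0)
T, D, K = 50, 32, 4  # illustrative sizes only
frames = rng.standard_normal((T, D))
emb, fw, hw = double_mha_pooling(
    frames,
    rng.standard_normal((K, D // K)),
    rng.standard_normal(D // K),
)
print(emb.shape)
```

The output length depends only on the head dimension, not on the number of frames `T`, which is what lets the pooling layer map variable-length utterances to fixed-length speaker vectors.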

Code & Models

Repositories

miquelindia90/DoubleAttentionSpeakerVerification
PyTorch · Official