Explainable Attribute-Based Speaker Verification

Xiaoliang Wu; Chau Luu; Peter Bell; Ajitha Rajan

arXiv:2405.19796 · cs.SD · May 31, 2024 · 2 cites

Open Access

TL;DR

This paper introduces an explainable speaker verification system that verifies a speaker's identity by comparing personal attributes (such as gender, nationality, and age) extracted automatically from voice recordings, aiming for transparency and human-like reasoning.

Contribution

It presents a novel attribute-based approach to speaker verification that improves interpretability, at some cost in performance relative to non-explainable methods.

Findings

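01. On the VoxCeleb1 test set, the system's best performance is comparable to the reference result obtained when all attributes are correct.

02. The system sacrifices some performance relative to non-explainable methods in exchange for explainability.

03. The work lays the groundwork for expanding the attribute set in the future.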

Abstract

This paper proposes a fully explainable approach to speaker verification (SV), a task that fundamentally relies on individual speaker characteristics. The opaque use of speaker attributes in current SV systems raises concerns of trust. Addressing this, we propose an attribute-based explainable SV system that identifies speakers by comparing personal attributes such as gender, nationality, and age extracted automatically from voice recordings. We believe this approach better aligns with human reasoning, making it more understandable than traditional methods. Evaluated on the VoxCeleb1 test set, the best performance of our system is comparable with the ground truth established when using all correct attributes, proving its efficacy. Whilst our approach sacrifices some performance compared to non-explainable methods, we believe that it moves us closer to the goal of transparent,…
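
The attribute comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Attributes` record, the `age_tolerance` parameter, and the "all attributes must match" decision rule are assumptions made purely for illustration, and the attribute values are assumed to have already been predicted from the recordings by some upstream model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Attributes:
    """Hypothetical attribute record predicted from one voice recording."""
    gender: str
    nationality: str
    age: int


def compare(a: Attributes, b: Attributes, age_tolerance: int = 5) -> Dict[str, bool]:
    """Return a per-attribute match table, which doubles as the explanation."""
    return {
        "gender": a.gender == b.gender,
        "nationality": a.nationality == b.nationality,
        # Assumption: ages within `age_tolerance` years count as a match.
        "age": abs(a.age - b.age) <= age_tolerance,
    }


def verify(a: Attributes, b: Attributes, min_matches: int = 3) -> Tuple[bool, Dict[str, bool]]:
    """Accept the pair as the same speaker if enough attributes match.

    Assumed decision rule: with three attributes and min_matches=3,
    every attribute must agree for the pair to be accepted.
    """
    matches = compare(a, b)
    accepted = sum(matches.values()) >= min_matches
    return accepted, matches


enrolled = Attributes(gender="female", nationality="Ireland", age=34)
same_claim = Attributes(gender="female", nationality="Ireland", age=37)
other_claim = Attributes(gender="male", nationality="Ireland", age=36)

print(verify(enrolled, same_claim))
print(verify(enrolled, other_claim))
```

Returning the match table alongside the decision is what makes a rejection explainable in this sketch: the caller can see which attribute disagreed. A rule this coarse cannot separate two speakers who share every attribute, which is consistent with the performance trade-off the abstract acknowledges.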

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques