Deep learning methods in speaker recognition: a review

Dávid Sztahó; György Szaszák; András Beke

arXiv:1911.06615·eess.AS·September 27, 2022

TL;DR

This review discusses how deep learning has become the leading approach in speaker recognition, replacing traditional methods with techniques like x-vectors, driven by increasing data availability and advancements in machine learning.

Contribution

It provides a comprehensive overview of deep learning applications in speaker recognition, highlighting the shift from traditional methods to DL-based solutions.

Findings

01. Deep learning now dominates speaker recognition methods.

02. x-vectors are the standard baseline in recent research.

03. Deep learning's effectiveness grows with more data.

Abstract

This paper summarizes applied deep learning practices in speaker recognition, covering both verification and identification. Speaker recognition has been a widely used area of speech technology. Much research has been carried out, yet little progress was achieved in the past 5-6 years. However, as deep learning techniques advance in most machine learning fields, they are replacing the former state-of-the-art methods in speaker recognition too. DL now appears to be the state-of-the-art solution for both speaker verification and identification. Standard x-vectors, in addition to i-vectors, are used as baselines in most recent works. The growing amount of collected data opens up the field to DL methods, which are most effective when data is plentiful.
