Text-Dependent Speaker Verification (TdSV) Challenge 2024: Team Naive System Report

Amir Mohammad Rostami; Pourya Jafarzadeh

arXiv:2605.14896 · cs.SD · May 15, 2026

TL;DR

This paper reports a system for the 2024 TdSV Challenge that combines neural networks, data augmentation, and ensemble learning to achieve low error rates in speaker and phrase verification.

Contribution

The paper introduces a multi-model ensemble that combines two pretrained networks adapted to the task, ResNet-TDNN and NeXt-TDNN, with a lightweight model, EfficientNet-A0, trained on the challenge data.
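
A minimal sketch of what score-level ensembling can look like is given below. The summary here does not state how the models' outputs are combined, so everything in the block is an assumption for illustration: per-model z-normalisation followed by a weighted average, the hypothetical helper names `z_normalise` and `fuse_scores`, and the model keys used in the usage lines.

```python
from statistics import mean, pstdev


def z_normalise(scores):
    """Shift and scale one model's trial scores to zero mean, unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: the model carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(per_model_scores, weights=None):
    """Weighted average of z-normalised scores (assumed fusion rule).

    per_model_scores maps a model name to its list of trial scores;
    every list must cover the same trials in the same order.
    """
    names = list(per_model_scores)
    n_trials = len(per_model_scores[names[0]])
    if any(len(per_model_scores[n]) != n_trials for n in names):
        raise ValueError("all models must score the same number of trials")
    if weights is None:
        # Equal weights by default; tuned weights would be another assumption.
        weights = {n: 1.0 / len(names) for n in names}
    normalised = {n: z_normalise(per_model_scores[n]) for n in names}
    return [
        sum(weights[n] * normalised[n][i] for n in names)
        for i in range(n_trials)
    ]


# Usage with made-up scores for two trials (model keys are illustrative only).
example = {
    "resnet_tdnn": [0.0, 1.0],
    "next_tdnn": [10.0, 20.0],
    "efficientnet_a0": [-1.0, 1.0],
}
fused = fuse_scores(example)
```

Normalising before averaging keeps a model with a wider score range from dominating the fused score, which is one common reason to prefer this over averaging raw scores.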

Findings

1. The system achieved a MinDCF of 0.0461 and an EER of 1.3% on the challenge.

2. Combining several neural architectures with data augmentation proved effective.

3. Ensemble learning improved speaker and phrase verification accuracy.

Abstract

This paper presents a system for the 2024 Text-Dependent Speaker Verification (TdSV) Challenge. The system achieved a Minimum Detection Cost Function (MinDCF) of 0.0461 and an Equal Error Rate (EER) of 1.3%. Our approach focused on adapting existing state-of-the-art neural networks, ResNet-TDNN and NeXt-TDNN, originally trained on the VoxCeleb dataset. This strategy was chosen because of the limited challenge duration and the available resources at the time. In addition, we designed a lightweight and resource-efficient model, EfficientNet-A0, trained specifically on the challenge dataset to improve adaptation and strengthen the ensemble approach. Our system combines advanced neural architectures, extensive data augmentation, and optimised hyperparameters. These components helped achieve strong performance in text-dependent speaker verification. The results also demonstrate the…
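
For readers unfamiliar with the two reported metrics, the sketch below shows one simple way to compute them from trial scores. It is not the challenge's scoring tool: the function names are hypothetical, the EER is approximated at the threshold where the two error rates are closest, and the cost parameters `p_target`, `c_miss`, and `c_fa` are placeholder assumptions, since the values used by the challenge are not given here.

```python
def error_rates(scores, labels):
    """Return (threshold, FRR, FAR) for every candidate threshold.

    A trial is accepted when its score is >= the threshold.
    labels: 1 for target trials, 0 for non-target trials.
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    nontargets = [s for s, y in zip(scores, labels) if y == 0]
    # Every distinct score is a candidate, plus +inf so that the
    # "reject everything" operating point is also considered.
    thresholds = sorted(set(scores)) + [float("inf")]
    rates = []
    for t in thresholds:
        frr = sum(s < t for s in targets) / len(targets)         # misses
        far = sum(s >= t for s in nontargets) / len(nontargets)  # false alarms
        rates.append((t, frr, far))
    return rates


def equal_error_rate(scores, labels):
    """Approximate EER: mean of FRR and FAR where they are closest."""
    _, frr, far = min(error_rates(scores, labels),
                      key=lambda r: abs(r[1] - r[2]))
    return (frr + far) / 2


def min_dcf(scores, labels, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Minimum normalised detection cost over all thresholds.

    The default cost parameters are assumptions for illustration.
    """
    best = min(c_miss * p_target * frr + c_fa * (1 - p_target) * far
               for _, frr, far in error_rates(scores, labels))
    # Normalise by the cost of the better trivial system
    # (always reject or always accept).
    return best / min(c_miss * p_target, c_fa * (1 - p_target))


# Usage with made-up scores: four target and four non-target trials.
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
demo_labels = [1, 1, 1, 0, 1, 0, 0, 0]
demo_eer = equal_error_rate(demo_scores, demo_labels)
demo_dcf = min_dcf(demo_scores, demo_labels)
```

The EER weights misses and false alarms equally, while MinDCF weights them by the assumed target prior and error costs, so the two metrics can rank systems differently.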
