Tongji University Undergraduate Team for the VoxCeleb Speaker Recognition Challenge 2020

Shufan Shen; Ran Miao; Yi Wang; Zhihua Wei

arXiv:2010.10145·cs.SD·October 21, 2020

TL;DR

This paper describes the Tongji University undergraduate team's submission to the VoxCeleb Speaker Recognition Challenge 2020. The system uses a ResNet34 model enhanced with RSBU-CW denoising modules, together with data augmentation and score fusion, to improve speaker verification accuracy.

Contribution

The team applied the RSBU-CW denoising module to ResNet34 and combined it with data augmentation and score fusion, reaching 0.2973 DCF and 4.97% EER on the challenge evaluation set.

Findings

01

Achieved 0.2973 DCF and 4.97% EER on the challenge evaluation set.

02

Enhanced ResNet34 with RSBU-CW improves denoising in speaker recognition.

03

Fusion of two models boosts overall performance.
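The summary does not spell out what RSBU-CW does internally. As a rough illustration only, the sketch below assumes the usual description of a residual shrinkage unit with channel-wise thresholds: each channel of a feature map is soft-thresholded, with the threshold set to a scaling factor times that channel's mean absolute value. The function names, the tensor layout, and the fixed `alpha` vector are hypothetical; in a real network `alpha` would come from a small learned sub-network, and the team's actual implementation may differ.

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink values toward zero by tau; entries with |x| <= tau become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def channelwise_shrink(features, alpha):
    """Apply one soft threshold per channel.

    features: array of shape (channels, freq, time)  (assumed layout)
    alpha:    array of shape (channels,), each value in (0, 1)
              (assumed to be given; normally predicted by the network)
    """
    features = np.asarray(features, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Per-channel mean absolute activation, shape (channels,)
    abs_mean = np.abs(features).mean(axis=(1, 2))
    # Threshold per channel, broadcast over freq and time
    tau = (alpha * abs_mean)[:, None, None]
    return soft_threshold(features, tau)
```

Small activations, which are more likely to be noise, are zeroed out, while larger ones are kept but reduced by the threshold.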

Abstract

In this report, we describe the submission of the Tongji University undergraduate team to the CLOSE track of the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2020 at Interspeech 2020. We applied the RSBU-CW module to the ResNet34 framework to improve the denoising ability of the network and to better handle speaker verification in complex acoustic environments. We trained two variants of ResNet and used score fusion and data augmentation to improve the performance of the model. Our fusion of two selected systems for the CLOSE track achieves 0.2973 DCF and 4.97% EER on the challenge evaluation set.
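The abstract does not say how the two systems' scores were fused or how the EER was computed. The sketch below is one plausible reading, not the team's method: it assumes fusion is a weighted average of per-system normalised scores, and it estimates the equal error rate by sweeping the decision threshold over the observed scores. The names `fuse_scores` and `equal_error_rate` and the default `weight` are hypothetical.

```python
import numpy as np

def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted average of two systems' trial scores (assumed fusion rule).

    Each system is first normalised to zero mean and unit variance so that
    neither dominates because of its score scale. Assumes each score list
    has non-zero variance.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return weight * a + (1.0 - weight) * b

def equal_error_rate(scores, labels):
    """Approximate EER: the point where false accepts equal false rejects.

    scores: higher means "more likely the same speaker"
    labels: 1/True for target trials, 0/False for non-target trials
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        far = np.mean(scores[~labels] >= t)  # non-targets accepted
        frr = np.mean(scores[labels] < t)    # targets rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

A typical use would be `equal_error_rate(fuse_scores(sys1, sys2), labels)`, where `sys1` and `sys2` hold the two systems' scores for the same list of trials.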

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Methods: 1x1 Convolution · Average Pooling · Batch Normalization · Residual Connection · Residual Block · Bottleneck Residual Block · Max Pooling · Convolution · Kaiming Initialization