ASASVIcomtech: The Vicomtech-UGR Speech Deepfake Detection and SASV Systems for the ASVspoof5 Challenge
Juan M. Martín-Doñas, Eros Roselló, Angel M. Gomez, Aitor Álvarez, Iván López-Espejo, Antonio M. Peinado

TL;DR
This paper details the participation of the ASASVIcomtech team in the ASVspoof5 Challenge, exploring deepfake detection and speaker verification with various models, data analysis, and ensemble techniques, achieving competitive results.
Contribution
The paper introduces a comprehensive analysis of challenge data and explores multiple open-condition systems using self-supervised models and ensemble methods for spoofing detection and speaker verification.
Findings
The closed-condition system, built on a deep complex convolutional recurrent architecture, did not yield noteworthy results.
Open-condition systems leveraging self-supervised models showed promising performance.
Ensemble systems achieved very competitive results in both challenge tracks.
Abstract
This paper presents the work carried out by the ASASVIcomtech team, made up of researchers from Vicomtech and the University of Granada, for the ASVspoof5 Challenge. The team participated in both Track 1 (speech deepfake detection) and Track 2 (spoofing-aware speaker verification). The work started with an analysis of the available challenge data, which was regarded as an essential step to avoid potential biases in the trained models, and whose main conclusions are presented here. With respect to the proposed approaches, a closed-condition system employing a deep complex convolutional recurrent architecture was developed for Track 1, although, unfortunately, no noteworthy results were achieved. On the other hand, different possibilities of open-condition systems, based on leveraging self-supervised models, augmented training data from previous challenges, and novel vocoders, were…
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
