ELEAT-SAGA: Early & Late Integration with Evading Alternating Training for Spoof-Robust Speaker Verification

Amro Asali; Yehuda Ben-Shimol; Itshak Lapidot

arXiv:2602.13761 · eess.AS · February 17, 2026

Open Access

TL;DR

This paper introduces SASV-SAGA, a spoofing-robust automatic speaker verification (SASV) architecture that combines score-aware gated attention with alternating training strategies to make speaker verification more robust to spoofing attacks.

Contribution

The paper proposes an SASV model in which score-aware gated attention (SAGA) modulates speaker embeddings using countermeasure (CM) scores, and introduces evading alternating training (EAT) to improve robustness to spoofing.

Findings

01. Achieved an SASV-EER of 1.22% on the ASVspoof 2019 dataset.

02. Reported significant improvements over baseline methods.

03. Validated the effectiveness of the score-aware gated attention mechanism and of the alternating training strategies (ATMM and EAT).
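
For readers unfamiliar with the metric in the first finding: SASV-EER is commonly defined as an equal error rate in which target trials are the positives and both zero-effort non-target trials and spoofed trials are the negatives. The sketch below assumes that definition and uses a simple threshold sweep. The function name `equal_error_rate` and all scores are invented for illustration; none of them come from the paper.

```python
def equal_error_rate(positive_scores, negative_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(positive_scores) | set(negative_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: positives that fall below the threshold.
        frr = sum(s < t for s in positive_scores) / len(positive_scores)
        # False acceptance rate: negatives at or above the threshold.
        far = sum(s >= t for s in negative_scores) / len(negative_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): higher means "same speaker and bona fide".
target = [0.9, 0.8, 0.7, 0.4]
nontarget = [0.1, 0.2, 0.3, 0.5]   # zero-effort impostors
spoof = [0.05, 0.6, 0.15, 0.25]    # VC / TTS attacks

# Assumed SASV-EER convention: pool both kinds of negatives.
sasv_eer = equal_error_rate(target, nontarget + spoof)
print(f"SASV-EER on toy data: {sasv_eer:.2%}")
```

Pooling the two negative classes is what separates SASV-EER from a plain speaker-verification EER (target vs. non-target only) or a spoofing EER (target vs. spoof only).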

Abstract

Spoofing-robust automatic speaker verification (SASV) seeks to build automatic speaker verification systems that are robust against both zero-effort impostor attacks and sophisticated spoofing techniques such as voice conversion (VC) and text-to-speech (TTS). In this work, we propose a novel SASV architecture that introduces score-aware gated attention (SAGA), SASV-SAGA, enabling dynamic modulation of speaker embeddings based on countermeasure (CM) scores. By integrating speaker embeddings and CM scores from pre-trained ECAPA-TDNN and AASIST models respectively, we explore several integration strategies including early, late, and full integration. We further introduce alternating training for multi-module (ATMM) and a refined variant, evading alternating training (EAT). Experimental results on the ASVspoof 2019 Logical Access (LA) and SpoofCeleb datasets demonstrate significant…
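
The idea of modulating a speaker embedding with a CM score can be illustrated with a minimal sketch. This is not the paper's formulation, which the truncated abstract does not give. It assumes a per-dimension sigmoid gate driven by a scalar CM score, with hypothetical parameters `weights` and `biases`, and it assumes a higher CM score means "more likely bona fide".

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_embedding(embedding, cm_score, weights, biases):
    """Scale each embedding dimension by a gate computed from the CM score.

    weights and biases are hypothetical gate parameters (one per dimension);
    in a trained system they would be learned.
    """
    gates = [sigmoid(w * cm_score + b) for w, b in zip(weights, biases)]
    return [g * x for g, x in zip(gates, embedding)]


def l2_norm(vector):
    return math.sqrt(sum(x * x for x in vector))


# Toy 4-dimensional "speaker embedding" (real ECAPA-TDNN embeddings are much larger).
embedding = [0.5, -1.2, 0.8, 0.3]
weights = [1.0, 2.0, 0.5, 1.5]   # assumed values, for illustration only
biases = [0.0, 0.0, 0.0, 0.0]

bona_fide_like = gate_embedding(embedding, 4.0, weights, biases)
spoof_like = gate_embedding(embedding, -4.0, weights, biases)

print("norm with high CM score:", round(l2_norm(bona_fide_like), 3))
print("norm with low CM score: ", round(l2_norm(spoof_like), 3))
```

With positive gate weights, a low CM score drives the gates toward zero and suppresses the embedding, while a high score passes it through almost unchanged. Per-dimension gates are used here because a single scalar gate would leave cosine similarity between embeddings unchanged.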

Taxonomy

Topics: Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning · Speech and Audio Processing