Bridging the Spoof Gap: A Unified Parallel Aggregation Network for Voice Presentation Attacks

Awais Khan; Khalid Mahmood Malik

arXiv:2309.10560·cs.SD·September 20, 2023

Open Access

TL;DR

This paper introduces a unified neural network architecture that effectively detects both logical and physical voice spoofing attacks, reducing error disparities and enhancing security in voice biometric systems.

Contribution

The paper proposes a novel Parallel Stacked Aggregation Network that processes raw audio for unified spoofing detection, addressing the gap between logical and physical attack detection methods.

Findings

01. Outperforms state-of-the-art solutions on ASVspoof-2019 and VSDC datasets.

02. Reduces disparities in Equal Error Rate between attack types.

03. Demonstrates superior generalizability and robustness in spoofing detection.
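Because the findings are stated in terms of Equal Error Rate (EER), a small illustration of how EER is commonly computed may help. The sketch below is not taken from the paper: the function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for illustration. It sweeps a threshold over the observed scores and reports the point where the false acceptance rate and false rejection rate are closest.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold) from a simple threshold sweep.

    Assumption: a higher score means the trial looks more like bona fide speech.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Running the same function separately on logical-access and physical-access trials and comparing the two values is one way to quantify the EER disparity referred to above.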

Abstract

Automatic Speaker Verification (ASV) systems are increasingly used in voice biometrics for user authentication but are susceptible to logical and physical spoofing attacks, posing security risks. Existing research mainly tackles logical or physical attacks separately, leading to a gap in unified spoofing detection. Moreover, when existing systems attempt to handle both types of attacks, they often exhibit significant disparities in the Equal Error Rate (EER). To bridge this gap, we present a Parallel Stacked Aggregation Network that processes raw audio. Our approach employs a split-transform-aggregation technique, dividing utterances into convolved representations, applying transformations, and aggregating the results to identify logical (LA) and physical (PA) spoofing attacks. Evaluation on the ASVspoof-2019 and VSDC datasets shows the effectiveness of the proposed system. It…
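The split-transform-aggregation idea mentioned in the abstract can be sketched in a few lines. This is a toy illustration only, not the authors' architecture: the helper names `conv1d_same` and `split_transform_aggregate`, the fixed kernels, the ReLU non-linearity, and the residual shortcut are all assumptions, loosely modelled on aggregated-transformation blocks in general. Each parallel branch convolves the same raw waveform, transforms it, and the branch outputs are summed.

```python
import numpy as np


def conv1d_same(x, kernel):
    """1-D convolution whose output has the same length as x."""
    return np.convolve(x, kernel, mode="same")


def split_transform_aggregate(x, branch_kernels):
    """Toy split-transform-aggregate block over a raw waveform.

    x: 1-D array of audio samples.
    branch_kernels: list of (k_in, k_out) kernel pairs, one per parallel branch
    (hypothetical stand-ins for learned filters).
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for k_in, k_out in branch_kernels:
        h = conv1d_same(x, k_in)      # split: branch-specific convolved representation
        h = np.maximum(h, 0.0)        # transform: ReLU non-linearity
        out += conv1d_same(h, k_out)  # transform again, then aggregate by summation
    return out + x                    # residual shortcut (an assumption of this sketch)


# Two identical pass-through branches on a three-sample signal.
signal = np.array([1.0, -2.0, 3.0])
branches = [(np.array([1.0]), np.array([1.0]))] * 2
print(split_transform_aggregate(signal, branches))
```

In a trained network the kernels would be learned and the blocks stacked; the fixed kernels here only make the data flow easy to follow.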

Taxonomy

Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Speech and Audio Processing