An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech Characterization

Manasi Chhibber, Jagabandhu Mishra, Hyejin Shim, and Tomi H. Kinnunen

arXiv:2409.11027·eess.AS·September 18, 2024

TL;DR

This paper introduces explainable probabilistic attribute embeddings for spoofed speech detection and attack attribution. Their accuracy is on par with that of raw countermeasure embeddings, and a decision tree back-end together with attribute analysis makes the decisions easier to interpret.

Contribution

The proposed probabilistic attribute embeddings add interpretability while keeping spoofing detection and attack attribution performance on par with raw embeddings. The paper also analyzes which attributes matter most.

Findings

01. Attribute embeddings achieve 99.7% accuracy in spoofing detection.

02. Attribute embeddings achieve 99.2% accuracy in attack attribution.

03. Important attributes include acoustic features, vocoder, and speaker modeling.

Abstract

We propose a novel approach for spoofed speech characterization through explainable probabilistic attribute embeddings. In contrast to high-dimensional raw embeddings extracted from a spoofing countermeasure (CM), whose dimensions are not easy to interpret, the probabilistic attributes are designed to gauge the presence or absence of sub-components that make up a specific spoofing attack. These attributes are then applied to two downstream tasks: spoofing detection and attack attribution. To extend interpretability to the back-end as well, we adopt a decision tree classifier. Our experiments on the ASVspoof2019 dataset with spoof CM embeddings extracted from three models (AASIST, Rawboost-AASIST, SSL-AASIST) suggest that the performance of the attribute embeddings is on par with that of the original raw spoof CM embeddings for both tasks. The best performance achieved with the proposed approach…
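
The pipeline described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the attribute names and values, the toy logits, the use of a "none" value for bona fide speech, and the exhaustive-search depth-1 tree (a stand-in for a full decision tree classifier) are all illustrative assumptions. It only shows how per-attribute posteriors might be concatenated into an interpretable embedding and how a tree-style rule over that embedding stays human-readable.

```python
import math

# Hypothetical attribute inventory; names and values are illustrative only.
ATTRIBUTES = {
    "acoustic_model": ["none", "neural", "statistical"],
    "vocoder": ["none", "neural", "classical"],
}


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attribute_embedding(logits_per_attribute):
    """Concatenate per-attribute posteriors into one named vector."""
    embedding, names = [], []
    for attr, values in ATTRIBUTES.items():
        embedding.extend(softmax(logits_per_attribute[attr]))
        names.extend(f"{attr}={v}" for v in values)
    return embedding, names


def fit_stump(X, y):
    """Depth-1 tree: choose (dimension, threshold, label) with fewest errors."""
    best = None
    for d in range(len(X[0])):
        for t in sorted({x[d] for x in X}):
            for label_hi in (0, 1):  # label predicted when x[d] >= t
                preds = [label_hi if x[d] >= t else 1 - label_hi for x in X]
                errors = sum(p != yi for p, yi in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, d, t, label_hi)
    return best[1:]


def predict_stump(stump, x):
    d, t, label_hi = stump
    return label_hi if x[d] >= t else 1 - label_hi


# Made-up logits standing in for attribute-classifier outputs.
# Labels: 0 = bona fide, 1 = spoof (an assumed encoding).
toy = [
    ({"acoustic_model": [3.0, 0.0, 0.0], "vocoder": [3.0, 0.0, 0.0]}, 0),
    ({"acoustic_model": [2.5, 0.5, 0.0], "vocoder": [2.0, 0.0, 0.5]}, 0),
    ({"acoustic_model": [0.0, 3.0, 0.0], "vocoder": [0.0, 3.0, 0.0]}, 1),
    ({"acoustic_model": [0.0, 0.5, 2.5], "vocoder": [0.5, 0.0, 2.0]}, 1),
]

X, y, names = [], [], []
for logits, label in toy:
    emb, names = attribute_embedding(logits)
    X.append(emb)
    y.append(label)

stump = fit_stump(X, y)
dim, threshold, label_hi = stump
# The learned rule refers to a named attribute probability, not an opaque dimension.
print(f"if P({names[dim]}) >= {threshold:.3f}: predict {label_hi}")
```

Because every dimension of the embedding carries a name such as `vocoder=neural`, each split in the tree reads as a statement about a spoofing sub-component. In practice a library classifier such as scikit-learn's `DecisionTreeClassifier` would likely replace the hand-rolled stump used here.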

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing