An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech Characterization
Manasi Chhibber, Jagabandhu Mishra, Hyejin Shim, and Tomi H. Kinnunen

TL;DR
This paper introduces an explainable probabilistic attribute embedding method for spoofed speech detection and attack attribution. It achieves accuracy comparable to raw embeddings while improving interpretability through decision trees and attribute analysis.
Contribution
The proposed probabilistic attribute embeddings are interpretable while maintaining high performance in spoofing detection and attack attribution; the paper also analyzes attribute importance.
Findings
Attribute embeddings achieve 99.7% accuracy in spoofing detection.
Attribute embeddings achieve 99.2% accuracy in attack attribution.
Important attributes include acoustic features, vocoder, and speaker modeling.
Abstract
We propose a novel approach for spoofed speech characterization through explainable probabilistic attribute embeddings. In contrast to high-dimensional raw embeddings extracted from a spoofing countermeasure (CM), whose dimensions are not easy to interpret, the probabilistic attributes are designed to gauge the presence or absence of sub-components that make up a specific spoofing attack. These attributes are then applied to two downstream tasks: spoofing detection and attack attribution. To extend interpretability to the back-end as well, we adopt a decision tree classifier. Our experiments on the ASVspoof2019 dataset with spoof CM embeddings extracted from three models (AASIST, Rawboost-AASIST, SSL-AASIST) suggest that the performance of the attribute embeddings is on par with the original raw spoof CM embeddings for both tasks. The best performance achieved with the proposed approach…
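The idea of a probabilistic attribute embedding can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each attribute has its own small classifier over the raw CM embedding, and that the per-attribute posterior probabilities are concatenated into one vector with named dimensions. The attribute names and values, the embedding size, the random linear heads, and the helper names (`softmax`, `linear_head`, `attribute_embedding`) are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_head(embedding, weights, bias):
    """One score per attribute value: dot(row, embedding) + bias."""
    return [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, bias)
    ]


def attribute_embedding(embedding, heads):
    """Concatenate per-attribute posteriors into one named feature vector."""
    names, features = [], []
    for attr, (values, weights, bias) in heads.items():
        probs = softmax(linear_head(embedding, weights, bias))
        features.extend(probs)
        names.extend(f"{attr}={v}" for v in values)
    return names, features


# Hypothetical setup: two attributes with three possible values each.
# A real system would use trained attribute classifiers, not random weights.
rng = random.Random(0)
DIM = 8  # assumed size of the raw CM embedding, for illustration only
ATTRIBUTES = {
    "vocoder": ["none", "WORLD", "neural"],
    "speaker_modeling": ["none", "one-hot", "d-vector"],
}
heads = {
    attr: (
        values,
        [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in values],
        [0.0] * len(values),
    )
    for attr, values in ATTRIBUTES.items()
}

raw_embedding = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
names, features = attribute_embedding(raw_embedding, heads)

for name, p in zip(names, features):
    print(f"{name}: {p:.3f}")
```

Because every dimension carries a name such as `vocoder=neural`, a decision tree trained on these vectors (for instance with scikit-learn's `DecisionTreeClassifier`) would split on thresholds over named attribute probabilities, which is what makes the back-end readable.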
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing
