Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection

Yassine El Kheir, Youness Samih, Suraj Maharjan, Tim Polzehl, and Sebastian Möller

arXiv:2502.03559·eess.AS·February 10, 2025

TL;DR

This study analyzes SSL models for audio deepfake detection layer by layer and finds that the lower transformer layers are the most discriminative, so a model restricted to a few lower layers can cut computational cost while keeping competitive EER.

Contribution

It offers the first comprehensive layer-wise analysis of SSL models in audio deepfake detection, highlighting the importance of lower layers and demonstrating efficient model configurations.

Findings

01 Lower layers are most discriminative for deepfake detection.

02 Models maintain competitive EER scores with fewer layers.

03 Using only a few lower layers reduces computational costs.
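To make the third finding concrete, here is a minimal sketch of pooling features from only the lowest layers of an SSL encoder. It assumes the per-layer hidden states are already available as an array of shape (layers, frames, dims) and fuses them with a softmax-weighted sum followed by mean pooling over time. The function name `pool_lower_layers`, the weighting scheme, and the shapes are illustrative assumptions, not the configuration confirmed by the paper.

```python
import numpy as np

def pool_lower_layers(hidden_states, layer_logits, num_layers):
    """Fuse the first `num_layers` layer outputs into one utterance embedding.

    hidden_states: array of shape (L, T, D), one (T, D) matrix per layer (assumed layout)
    layer_logits:  array of shape (L,), unnormalised per-layer weights (hypothetical)
    num_layers:    how many of the lowest layers to keep
    """
    hs = np.asarray(hidden_states, dtype=float)[:num_layers]
    logits = np.asarray(layer_logits, dtype=float)[:num_layers]
    # Softmax over the kept layers only, shifted for numerical stability.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Weighted sum across the layer axis gives a (T, D) frame-level feature.
    fused = np.tensordot(weights, hs, axes=(0, 0))
    # Mean-pool over time to obtain a (D,) utterance-level feature.
    return fused.mean(axis=0)
```

In this NumPy sketch the upper layers are simply sliced away after the fact. The computational saving described above would come from not running those layers in the encoder at all.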

Abstract

This paper conducts a comprehensive layer-wise analysis of self-supervised learning (SSL) models for audio deepfake detection across diverse contexts, including multilingual datasets (English, Chinese, Spanish), partial, song, and scene-based deepfake scenarios. By systematically evaluating the contributions of different transformer layers, we uncover critical insights into model behavior and performance. Our findings reveal that lower layers consistently provide the most discriminative features, while higher layers capture less relevant information. Notably, all models achieve competitive equal error rate (EER) scores even when employing a reduced number of layers. This indicates that we can reduce computational costs and increase the inference speed of detecting deepfakes by utilizing only a few lower layers. This work enhances our understanding of SSL models in deepfake detection,…
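Since the abstract reports results as equal error rate (EER), a small sketch of how EER can be computed from detector scores may help. It assumes higher scores mean "more likely bona fide" and sweeps the observed scores as thresholds; the function name `compute_eer` and this simple thresholding scheme are illustrative, not the paper's evaluation script.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more bona fide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct score seen, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return eer, thresholds[idx]
```

With few trials the two error curves may not cross exactly, which is why the sketch averages the two rates at the closest point. Evaluation toolkits typically interpolate instead.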

Code & Models

Repositories

yaselley/ssl_layerwise_deepfake (PyTorch, official)

Videos

Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection · underline

Taxonomy

Topics: Digital Media Forensic Detection · Handwritten Text Recognition Techniques · Speech Recognition and Synthesis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings