An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection

Yuankun Xie; Haonan Cheng; Yutian Wang; Long Ye

arXiv:2309.03036 · cs.SD · November 22, 2023

TL;DR

This paper introduces a novel fine-grained approach called Temporal Deepfake Location (TDL) for detecting partially spoofed audio at the frame level, utilizing embedding similarity and temporal convolution to improve accuracy.

Contribution

The paper presents a method that combines an embedding similarity module with a temporal convolution operation for frame-level spoof detection; it outperforms baseline models on the ASVspoof2019 Partial Spoof dataset.

Findings

01. Outperforms baseline models on the ASVspoof2019 Partial Spoof dataset.

02. Demonstrates superior cross-dataset detection performance.

03. Effective in identifying real and fake audio frames.

Abstract

Partially spoofed audio detection is a challenging task because the authenticity of audio must be located accurately at the frame level. To address this issue, we propose a fine-grained partially spoofed audio detection method, Temporal Deepfake Location (TDL), which effectively captures both feature and location information. Specifically, our approach involves two novel parts: an embedding similarity module and a temporal convolution operation. To better discriminate between real and fake features, the embedding similarity module is designed to generate an embedding space that separates real frames from fake frames. To concentrate on position information, the temporal convolution operation calculates frame-specific similarities among neighboring frames and dynamically selects informative neighbors for convolution. Extensive…
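
The idea of computing frame-specific similarities among neighboring frames and keeping only the informative ones can be illustrated with a small sketch. This is not the authors' implementation: the function names `cosine` and `neighbor_aggregate`, the parameters `radius` and `top_k`, the use of cosine similarity, and the plain mean over the selected frames are all assumptions made for illustration. The actual TDL operation in the paper presumably uses learned convolution weights.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighbor_aggregate(frames, radius=2, top_k=2):
    """Hypothetical similarity-guided aggregation over time.

    For each frame t, look at neighbors within `radius` frames on either
    side, keep the `top_k` most similar ones (assumed selection rule),
    and average them together with frame t itself.
    """
    n = len(frames)
    out = []
    for t in range(n):
        lo, hi = max(0, t - radius), min(n, t + radius + 1)
        neighbors = [i for i in range(lo, hi) if i != t]
        # Rank neighbors by similarity to the current frame, best first.
        neighbors.sort(key=lambda i: cosine(frames[t], frames[i]), reverse=True)
        selected = [t] + neighbors[:top_k]
        dim = len(frames[t])
        out.append(
            [sum(frames[i][d] for i in selected) / len(selected) for d in range(dim)]
        )
    return out


# Toy embeddings: two "real-like" frames followed by two "fake-like" frames.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
smoothed = neighbor_aggregate(frames, radius=1, top_k=1)
print(smoothed)
```

In this toy input each frame selects the neighbor on its own side of the real/fake boundary, so the boundary between frames 1 and 2 stays sharp. A uniform average over all neighbors would instead blur the two groups together at the transition.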

Code & Models

Repositories

xieyuankun/tdl-add
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Digital Media Forensic Detection · Speech and Audio Processing

Methods: Convolution