Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy

Xuanjun Chen; I-Ming Lin; Lin Zhang; Jiawei Du; Haibin Wu; Hung-yi Lee; Jyh-Shing Roger Jang

arXiv:2505.12994 · cs.SD · August 5, 2025

TL;DR

This paper introduces a method for tracing the source of neural audio codec-based deepfake speech using a taxonomy of neural audio codecs. It reports promising initial results on the CodecFake+ dataset and highlights open challenges for future work.

Contribution

The paper presents an approach to tracing the source of CodecFake speech using a neural audio codec taxonomy. This addresses a gap in existing anti-spoofing research, which has focused mainly on verifying whether audio is authentic.

Findings

01  Initial evidence shows that CodecFake source tracing is feasible

02  Experimental results on the CodecFake+ dataset support the approach

03  The results highlight challenges for future research in source tracing

Abstract

Recent advances in neural audio codec-based speech generation (CoSG) models have produced remarkably realistic audio deepfakes. We refer to deepfake speech generated by CoSG systems as codec-based deepfake, or CodecFake. Although existing anti-spoofing research on CodecFake predominantly focuses on verifying the authenticity of audio samples, almost no attention has been given to tracing the CoSG used in generating these deepfakes. In CodecFake generation, processes such as speech-to-unit encoding, discrete unit modeling, and unit-to-speech decoding are fundamentally based on neural audio codecs. Motivated by this, we introduce source tracing for CodecFake via neural audio codec taxonomy, which dissects neural audio codecs to trace CoSG. Our experimental results on the CodecFake+ dataset provide promising initial evidence for the feasibility of CodecFake source tracing while also…
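To make the idea of tracing through a codec taxonomy concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that a separate classifier has already predicted a few codec attributes from an audio sample, and it only shows the lookup step that narrows down candidate codecs. The attribute axes (quantizer, decoder), their values, and the codec names are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecProfile:
    """One row of a hypothetical codec taxonomy table."""
    name: str
    quantizer: str  # assumed attribute axis, for illustration only
    decoder: str    # assumed attribute axis, for illustration only


# Hypothetical entries; real codecs and their attributes are not listed here.
TAXONOMY = [
    CodecProfile("codec_a", "multi_codebook", "time_domain"),
    CodecProfile("codec_b", "single_codebook", "time_domain"),
    CodecProfile("codec_c", "multi_codebook", "frequency_domain"),
]


def trace_candidates(predicted: dict[str, str], taxonomy=TAXONOMY) -> list[str]:
    """Return names of codecs whose profile matches every predicted attribute."""
    return [
        codec.name
        for codec in taxonomy
        if all(getattr(codec, axis) == value for axis, value in predicted.items())
    ]


if __name__ == "__main__":
    # Both attributes predicted: a single candidate remains.
    print(trace_candidates({"quantizer": "multi_codebook", "decoder": "time_domain"}))
    # Only one attribute predicted: the candidate set stays larger.
    print(trace_candidates({"quantizer": "multi_codebook"}))
```

Predicting attributes rather than a codec identity means that a sample from a codec absent from the table can still be described by its attributes, even when no candidate name matches.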

Code & Models

Repositories

responsiblegenai/codecfake-source-tracing (Official)

Taxonomy

Topics: Speech Recognition and Synthesis · Generative Adversarial Networks and Image Synthesis · Speech and Audio Processing