A Noval Feature via Color Quantisation for Fake Audio Detection

Zhiyong Wang; Xiaopeng Wang; Yuankun Xie; Ruibo Fu; Zhengqi Wen; Jianhua Tao; Yukun Liu; Guanjun Li; Xin Qi; Yi Lu; Xuefei Liu; Yongwei Li

arXiv:2408.10849·cs.SD·August 21, 2024

Open Access

TL;DR

This paper introduces a novel feature extraction technique using color quantisation for fake audio detection, improving interpretability and classification performance over traditional spectral methods.

Contribution

It proposes a new color quantisation-based feature extraction method that enhances interpretability and detection accuracy in deepfake audio identification.

Findings

01. The color-quantised features outperform the original spectral input in classification accuracy.

02. Pretraining the recolor network improves fake audio detection.

03. The method provides an intuitive visualization of the focus areas in spectral reconstruction.

Abstract

In the field of deepfake detection, previous studies have focused on pre-training models with reconstruction or mask-and-predict objectives, as in wav2vec2.0 and the Masked Auto Encoder, and then transferring them to fake audio detection training, where the encoder is used to extract features. These methods have shown that reconstruction pre-training on real audio helps the model distinguish fake audio. Their drawback is poor interpretability: it is hard to present the differences between deepfake and real audio intuitively. This paper proposes a novel feature extraction method via color quantisation, which constrains the reconstruction to use a limited number of colors for the spectral image-like input. The proposed method ensures that the reconstructed input differs from the original, which allows for intuitive observation of the focus areas in the…
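To make the idea of "a limited number of colors" concrete, the following is a minimal sketch of color quantisation on a single-channel, spectrogram-like grid. It is not the authors' method: the page above refers to a learned recolor network, whereas this sketch substitutes plain 1-D k-means as a stand-in. The function name `quantise_levels` and its parameters `n_colors`, `n_iters`, and `seed` are hypothetical, and the input is random data rather than real audio.

```python
import random


def quantise_levels(spec, n_colors=4, n_iters=10, seed=0):
    """Quantise a 2-D grid of values to at most n_colors levels.

    Assumption: plain 1-D k-means stands in for the learned recolor
    network described in the paper; this is an illustration only.
    """
    values = [v for row in spec for v in row]
    rng = random.Random(seed)
    # Initial palette: n_colors values drawn from the data itself.
    palette = rng.sample(values, n_colors)

    def nearest(v):
        # Index of the palette entry closest to v.
        return min(range(n_colors), key=lambda k: (v - palette[k]) ** 2)

    for _ in range(n_iters):
        sums = [0.0] * n_colors
        counts = [0] * n_colors
        # Assignment step: accumulate each value under its nearest entry.
        for v in values:
            k = nearest(v)
            sums[k] += v
            counts[k] += 1
        # Update step: move each non-empty entry to the mean of its members.
        for k in range(n_colors):
            if counts[k]:
                palette[k] = sums[k] / counts[k]

    # Per-cell palette index, and the grid rebuilt from palette values only.
    index_map = [[nearest(v) for v in row] for row in spec]
    quantised = [[palette[k] for k in row] for row in index_map]
    return quantised, index_map, palette


# Hypothetical input: a 16 x 32 grid of random magnitudes.
demo_rng = random.Random(1)
spec = [[demo_rng.random() for _ in range(32)] for _ in range(16)]
quantised, index_map, palette = quantise_levels(spec, n_colors=4)
distinct_levels = {v for row in quantised for v in row}
```

The rebuilt grid contains at most `n_colors` distinct values, so it necessarily differs from the original input; `index_map` records which palette entry each time-frequency cell was assigned to, which is the kind of map one could plot to inspect how the input was partitioned.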

Taxonomy

Topics: Digital Media Forensic Detection · Music and Audio Processing · Music Technology and Sound Studies