Quantum Kernels for Audio Deepfake Detection Using Spectrogram Patch Features

Lisan Al Amin, Rakib Hossain, Mahbubul Islam, Faisal Quader, and Thanh Thi Nguyen

arXiv:2605.06035 · cs.SD · May 8, 2026

TL;DR

This paper introduces Q-Patch, a quantum feature map for audio spectrograms that improves deepfake detection by encoding local time-frequency patches with shallow, hardware-efficient quantum circuits.

Contribution

Q-Patch is a novel quantum kernel method that encodes local spectrogram patches into quantum states, targeting audio deepfake detection on near-term quantum devices.

Findings

01

Q-Patch achieves an AUROC of 0.87 in audio spoofing detection.

02

It outperforms the classical RBF-SVM baseline, which reaches an AUROC of 0.82.

03

Kernel analysis shows clear separation between the bona fide and spoofed classes in the kernel similarity values.
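
For readers less familiar with the metric behind the first two findings, AUROC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by pairwise comparison. The function name `auroc`, the choice of spoofed = 1 as the positive class, and the toy scores are all assumptions for illustration; none of them come from the paper.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of AUROC.

    labels: 1 for the positive class (assumed here: spoofed), 0 for negative.
    scores: higher means "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not results from the paper.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auroc(toy_labels, toy_scores)
```

In this toy case three of the four positive/negative pairs are ordered correctly, so the value is 0.75. The pairwise loop is quadratic in the number of samples, which is acceptable for a small balanced evaluation set but not for large ones.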

Abstract

Quantum machine learning has emerged as a promising tool for pattern recognition, yet many audio-focused approaches still treat spectrograms as generic images and do not explicitly exploit their time-frequency structure. We propose Q-Patch, a quantum feature map tailored to audio that encodes local time-frequency patches from mel-spectrograms into quantum states using shallow, hardware-efficient circuits with adjacency-aware entanglement. Each selected patch is summarized by a compact four-dimensional acoustic descriptor and mapped to a four-qubit circuit with depth at most three, enabling practical quantum kernel construction under near-term constraints. We evaluate Q-Patch on an audio spoofing detection task using a controlled, balanced protocol and compare it with size-matched classical baselines. Q-Patch improves discrimination between bona fide and spoofed samples, achieving an…
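
The pipeline described in the abstract (a four-dimensional patch descriptor, a shallow four-qubit circuit with entanglement between adjacent qubits, and a kernel built from the resulting states) can be illustrated with a small statevector simulation. This is only a sketch under stated assumptions: the abstract does not specify the gates, so the RY encoding, the CZ gates on neighbouring pairs, the three-layer layout, and the fidelity kernel |⟨φ(x)|φ(y)⟩|² are illustrative choices, and the descriptor values are made up. It should not be read as the authors' circuit.

```python
import numpy as np

N_QUBITS = 4  # one qubit per descriptor dimension (from the abstract)


def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def apply_single(state, gate, qubit):
    """Apply a 2x2 gate to one qubit of a (2,2,2,2)-shaped state tensor."""
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    return np.moveaxis(state, 0, qubit)


def apply_cz(state, q1, q2):
    """Apply CZ: flip the sign of amplitudes where both qubits are 1."""
    state = state.copy()
    idx = [slice(None)] * N_QUBITS
    idx[q1] = 1
    idx[q2] = 1
    state[tuple(idx)] *= -1
    return state


def feature_state(x):
    """Map a 4-dim descriptor to a 4-qubit state (assumed circuit layout)."""
    x = np.asarray(x, dtype=float)
    state = np.zeros((2,) * N_QUBITS, dtype=complex)
    state[(0,) * N_QUBITS] = 1.0                 # start in |0000>
    for q in range(N_QUBITS):                    # layer 1: encode features
        state = apply_single(state, ry(x[q]), q)
    for q1, q2 in [(0, 1), (2, 3)]:              # layer 2: neighbouring pairs
        state = apply_cz(state, q1, q2)
    for q in range(N_QUBITS):                    # layer 3: re-encode features
        state = apply_single(state, ry(x[q]), q)
    return state.reshape(-1)


def quantum_kernel(x, y):
    """Fidelity kernel |<phi(x)|phi(y)>|^2 between two descriptors."""
    return float(np.abs(np.vdot(feature_state(x), feature_state(y))) ** 2)


def kernel_matrix(X):
    """Gram matrix of the fidelity kernel over the rows of X."""
    states = np.array([feature_state(x) for x in X])
    return np.abs(states.conj() @ states.T) ** 2


# Hypothetical descriptors (already scaled to rotation angles), not paper data.
X_demo = np.array([
    [0.1, 0.5, 1.0, 2.0],
    [0.2, 0.4, 1.1, 1.9],
    [3.0, 0.1, 2.5, 0.3],
])
K_demo = kernel_matrix(X_demo)
```

A Gram matrix such as `K_demo` is symmetric with ones on the diagonal, so it could be handed to a classical kernel classifier that accepts precomputed kernels. How descriptors are scaled into rotation angles is a further design choice that the abstract does not cover.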
