Quantum Kernels for Audio Deepfake Detection Using Spectrogram Patch Features
Lisan Al Amin, Rakib Hossain, Mahbubul Islam, Faisal Quader, and Thanh Thi Nguyen

TL;DR
This paper introduces Q-Patch, a quantum feature map for audio spectrograms that improves deepfake detection by exploiting time-frequency structure with shallow, hardware-efficient quantum circuits.
Contribution
Q-Patch is a novel quantum kernel method that encodes local spectrogram patches into quantum states, enhancing audio deepfake detection on near-term quantum devices.
Findings
Q-Patch achieves an AUROC of 0.87 on audio spoofing detection.
It outperforms the classical RBF-SVM baseline, which reaches an AUROC of 0.82.
Kernel analysis, including similarity metrics, shows clear class separation.
Abstract
Quantum machine learning has emerged as a promising tool for pattern recognition, yet many audio-focused approaches still treat spectrograms as generic images and do not explicitly exploit their time-frequency structure. We propose Q-Patch, a quantum feature map tailored to audio that encodes local time-frequency patches from mel-spectrograms into quantum states using shallow, hardware-efficient circuits with adjacency-aware entanglement. Each selected patch is summarized by a compact four-dimensional acoustic descriptor and mapped to a four-qubit circuit with depth at most three, enabling practical quantum kernel construction under near-term constraints. We evaluate Q-Patch on an audio spoofing detection task using a controlled, balanced protocol and compare it with size-matched classical baselines. Q-Patch improves discrimination between bona fide and spoofed samples, achieving an…
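
To make the pipeline in the abstract concrete, here is a minimal NumPy sketch of a four-qubit feature map and the fidelity kernel built from it. It is not the authors' circuit: the summary does not specify the gates or the four acoustic descriptors, so the choices below are assumptions made for illustration. The sketch uses one layer of RY rotations followed by two layers of data-dependent ZZ phase gates on neighboring qubits (depth three in total), and random values in [0, π] stand in for the patch descriptors. The names `feature_state`, `quantum_kernel`, and `ADJACENT_PAIRS` are hypothetical.

```python
import numpy as np

N_QUBITS = 4
# Assumed nearest-neighbor couplings: (0,1) and (2,3) act in parallel,
# then (1,2), so the entangling part has depth two.
ADJACENT_PAIRS = [(0, 1), (2, 3), (1, 2)]

def feature_state(x):
    """Return the 16-dim statevector for a 4-dim descriptor x (angles in radians)."""
    x = np.asarray(x, dtype=float)
    # Layer 1: RY(x_i)|0> = cos(x_i/2)|0> + sin(x_i/2)|1> on each qubit.
    state = np.array([1.0 + 0j])
    for angle in x:
        qubit = np.array([np.cos(angle / 2), np.sin(angle / 2)], dtype=complex)
        state = np.kron(state, qubit)  # qubit 0 is the most significant bit
    # Layers 2-3: RZZ(x_i * x_j) = exp(-i * x_i * x_j / 2 * Z_i Z_j) on adjacent pairs.
    # These gates are diagonal, so they only multiply each amplitude by a phase.
    shifts = np.arange(N_QUBITS - 1, -1, -1)
    bits = (np.arange(2**N_QUBITS)[:, None] >> shifts) & 1
    z = 1 - 2 * bits  # Z eigenvalue: +1 for bit 0, -1 for bit 1
    phase = np.zeros(2**N_QUBITS)
    for i, j in ADJACENT_PAIRS:
        phase += x[i] * x[j] * z[:, i] * z[:, j]
    return state * np.exp(-0.5j * phase)

def quantum_kernel(A, B):
    """Fidelity kernel K[i, j] = |<phi(A_i)|phi(B_j)>|^2."""
    SA = np.stack([feature_state(a) for a in A])
    SB = np.stack([feature_state(b) for b in B])
    return np.abs(SA.conj() @ SB.T) ** 2

# Placeholder descriptors: 5 patches, 4 features each, scaled to [0, pi].
rng = np.random.default_rng(0)
X = rng.uniform(0, np.pi, size=(5, 4))
K = quantum_kernel(X, X)
print(np.round(K, 3))
```

The resulting Gram matrix is symmetric with ones on the diagonal, so it could be handed to any kernel classifier that accepts a precomputed kernel. The entangling gates are made data-dependent on purpose: a fixed entangling layer applied after the rotations would cancel in the inner product and leave a plain product kernel.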
