ICLAD: In-Context Learning with Comparison-Guidance for Audio Deepfake Detection

Benjamin Chou; Yi Zhu; Surya Koppisetti

arXiv:2604.16749·cs.SD·April 21, 2026

TL;DR

ICLAD introduces a novel in-context learning framework with comparison-guidance that leverages audio language models to improve generalization and interpretability in audio deepfake detection, especially on in-the-wild datasets.

Contribution

The paper proposes a training-free, comparison-guided in-context learning approach using audio language models for enhanced deepfake detection and interpretability.

Findings

01. ICLAD achieves up to a 2x relative improvement in macro F1 over the specialized detector on in-the-wild datasets.

02. The framework enables training-free generalization to unseen deepfakes.

03. ICLAD provides textual rationales for its detection outcomes.
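
Macro F1, the metric named in the findings above, is the unweighted mean of the per-class F1 scores, so the real and fake classes count equally regardless of how many samples each has. A minimal sketch of that computation follows; the function names and the toy labels are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], cls: str) -> float:
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention assumed here: a class with no true or predicted samples scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every class that appears."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels for illustration only (not results from the paper).
toy_true = ["real", "real", "fake", "fake"]
toy_pred = ["real", "fake", "fake", "fake"]
toy_score = macro_f1(toy_true, toy_pred)
```

Because each class contributes equally, a detector that labels everything as one class is penalized even when that class dominates the data.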

Abstract

Audio deepfakes pose a significant security threat, yet current state-of-the-art (SOTA) detection systems do not generalize well to realistic in-the-wild deepfakes. We introduce a novel In-Context Learning paradigm with comparison-guidance for Audio Deepfake detection (ICLAD). The framework enables the use of audio language models (ALMs) for training-free generalization to unseen deepfakes and provides textual rationales on the detection outcome. At the core of ICLAD is a pairwise comparative reasoning strategy that guides the ALM to discover and filter hallucinations and deepfake-irrelevant acoustic attributes. The ALM works alongside a specialized deepfake detector, whereby a routing mechanism feeds out-of-distribution samples to the ALM. On in-the-wild datasets, ICLAD improves macro F1 over the specialized detector, with up to…
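
The routing idea described in the abstract can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: the abstract does not say how out-of-distribution samples are identified or how the ALM is prompted, so `ood_score`, `detector`, `alm`, `examples`, and `ood_threshold` are all hypothetical placeholders supplied by the caller.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass
class Verdict:
    label: str      # "real" or "fake"
    source: str     # which component answered: "detector" or "alm"
    rationale: str  # textual rationale; empty when the detector answers


def route_sample(
    sample: Any,
    ood_score: Callable[[Any], float],                      # hypothetical OOD scorer
    detector: Callable[[Any], str],                         # specialized detector
    alm: Callable[[Any, Sequence[Any]], Tuple[str, str]],   # returns (label, rationale)
    examples: Sequence[Any],                                # in-context examples for the ALM
    ood_threshold: float = 0.5,                             # assumed cut-off, not from the paper
) -> Verdict:
    """Send out-of-distribution samples to the ALM, the rest to the detector."""
    if ood_score(sample) > ood_threshold:
        label, rationale = alm(sample, examples)
        return Verdict(label, "alm", rationale)
    return Verdict(detector(sample), "detector", "")
```

Passing the components in as callables keeps the sketch independent of any particular model or library; only the ALM branch produces a rationale, matching the abstract's description of textual rationales coming from the language model.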
