HoliAntiSpoof: Audio LLM for Holistic Speech Anti-Spoofing

Xuenan Xu; Yiming Ren; Liwei Liu; Wen Wu; Baoxiang Li; Chaochao Lu; Shuai Wang; Chao Zhang

arXiv:2602.04535 · cs.SD · February 5, 2026

TL;DR

HoliAntiSpoof introduces a novel audio large language model framework for holistic speech anti-spoofing, enabling joint reasoning over spoofing techniques, speech attributes, and semantic impacts, with improved detection and interpretability.

Contribution

The paper presents the first ALLM framework for speech anti-spoofing that reformulates detection as a text generation task, integrating semantic analysis and introducing a new benchmark.

Findings

1. HoliAntiSpoof outperforms traditional baselines in multiple settings.
2. In-context learning improves out-of-domain generalization.
3. ALLMs enable interpretable analysis of spoofing behaviors.

Abstract

Recent advances in speech synthesis and editing have made speech spoofing increasingly challenging. However, most existing methods treat spoofing as binary classification, overlooking that diverse spoofing techniques manipulate multiple, coupled speech attributes and their semantic effects. In this paper, we introduce HoliAntiSpoof, the first audio large language model (ALLM) framework for holistic speech anti-spoofing analysis. HoliAntiSpoof reformulates spoofing analysis as a unified text generation task, enabling joint reasoning over spoofing methods, affected speech attributes, and their semantic impacts. To support semantic-level analysis, we introduce DailyTalkEdit, a new anti-spoofing benchmark that simulates realistic conversational manipulations and provides annotations of semantic influence. Extensive experiments demonstrate that HoliAntiSpoof outperforms conventional…
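The reformulation described in the abstract, where spoofing analysis becomes a single text generation task, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the field names (`authenticity`, `method`, `affected_attributes`, `semantic_impact`), the JSON serialization, and the example values are hypothetical and are not taken from the paper, which may use a different output format.

```python
import json

# Hypothetical annotation schema; the paper's actual target format is not
# given in this summary, so these field names are illustrative only.
example_annotation = {
    "authenticity": "spoof",
    "method": "speech editing",
    "affected_attributes": ["linguistic content"],
    "semantic_impact": "the edited word changes the meaning of the utterance",
}


def to_target_text(annotation: dict) -> str:
    """Serialize a holistic annotation into one string a model could be trained to generate."""
    return json.dumps(annotation, sort_keys=True)


def is_spoof(generated_text: str) -> bool:
    """Recover the conventional binary decision from the generated text."""
    parsed = json.loads(generated_text)
    return parsed["authenticity"] == "spoof"


target = to_target_text(example_annotation)
decision = is_spoof(target)
```

Under these assumptions, the binary real/spoof label used by conventional detectors is one field of the generated text, while the remaining fields carry the method, attribute, and semantic information.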

Code & Models

Models

wsntxxn/HoliAntiSpoof (Hugging Face model, 53 downloads)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders