PITCH: AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response
Govind Mittal, Arthur Jakobsson, Kelly O. Marshall, Chinmay Hegde, Nasir Memon

TL;DR
PITCH is a novel AI-assisted challenge-response system designed to detect and tag real-time deepfake audio calls, significantly improving detection accuracy by combining machine analysis with human judgment to combat voice-cloning threats.
Contribution
This work introduces a comprehensive taxonomy of audio challenges, a large dataset for testing, and a hybrid human-AI system that enhances deepfake audio detection in real-time calls.
Findings
Machine detection achieved an AUROC score of 88.7%.
Humans independently achieved 72.6% accuracy on challenging calls.
The hybrid system improved detection accuracy to 84.5%.
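The findings above mix two metrics: AUROC for the machine detector and accuracy for the human and hybrid results, so the figures are not directly comparable. The sketch below illustrates the difference on toy data. The scores, labels, and the 0.5 threshold are invented for illustration and are not taken from the paper; the function names are hypothetical helpers, not part of PITCH.

```python
def auroc(scores, labels):
    """Rank-based AUROC: the probability that a randomly chosen positive
    (label 1, e.g. a deepfake) receives a higher score than a randomly
    chosen negative (label 0, e.g. a genuine call). Ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold
    (the threshold here is an assumption chosen for the toy example)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


# Toy detector scores (higher = more likely deepfake); invented values.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

print(f"AUROC:    {auroc(scores, labels):.3f}")
print(f"Accuracy: {accuracy(scores, labels):.3f}")
```

AUROC summarizes how well scores rank deepfakes above genuine calls across all thresholds, while accuracy depends on the single threshold chosen, which is why the same detector can report different values for the two.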
Abstract
The rise of AI voice-cloning technology, particularly audio Real-time Deepfakes (RTDFs), has intensified social engineering attacks by enabling real-time voice impersonation that bypasses conventional enrollment-based authentication. This technology represents an existential threat to phone-based authentication systems, while total identity fraud losses reached $43 billion. Unlike traditional robocalls, these personalized AI-generated voice attacks target high-value accounts and circumvent existing defensive measures, creating an urgent cybersecurity challenge. To address this, we propose PITCH, a robust challenge-response method to detect and tag interactive deepfake audio calls. We developed a comprehensive taxonomy of audio challenges based on the human auditory system, linguistics, and environmental factors, yielding 20 prospective challenges. Testing against leading voice-cloning…
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing
