CallShield: Secure Caller Authentication over Real-Time Audio Channels
Mouna Rabh, Yazan Boshmaf, Mashael Alsabah, Shammur Chowdhury, Mohamed Hefeeda, Issa Khalil

TL;DR
CallShield is a real-time caller authentication system that works entirely over the audio channel, combining neural watermarking with a secure, lightweight protocol. It reports an authentication success rate above 99% with low delay and high audio quality in telephony environments.
Contribution
CallShield is presented as the first system to authenticate callers entirely over the audio channel, without relying on speech transcription or internet connectivity. It combines neural watermarking with a lightweight security protocol.
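To make the idea of a lightweight symmetric-key check concrete, here is a minimal sketch of a challenge-response exchange between two contacts who already share a secret, as the abstract describes. It is not CallShield's actual protocol: the choice of HMAC-SHA256, the 4-byte challenge, the truncated tag length, and the names `make_challenge`, `respond`, and `verify` are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Assumption: tags are truncated to save airtime on a very slow channel.
# A shorter tag is cheaper to send but easier to guess.
TAG_BYTES = 4


def make_challenge(n_bytes: int = 4) -> bytes:
    """Callee side: draw a fresh random nonce for this call."""
    return secrets.token_bytes(n_bytes)


def respond(shared_key: bytes, challenge: bytes, caller_id: bytes,
            tag_bytes: int = TAG_BYTES) -> bytes:
    """Caller side: prove knowledge of the pairwise key for this nonce."""
    mac = hmac.new(shared_key, challenge + caller_id, hashlib.sha256).digest()
    return mac[:tag_bytes]


def verify(shared_key: bytes, challenge: bytes, caller_id: bytes,
           tag: bytes, tag_bytes: int = TAG_BYTES) -> bool:
    """Callee side: recompute the tag and compare in constant time."""
    expected = respond(shared_key, challenge, caller_id, tag_bytes)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = secrets.token_bytes(32)      # hypothetical pairwise shared secret
    nonce = make_challenge()
    tag = respond(key, nonce, b"alice")
    print(verify(key, nonce, b"alice", tag))
```

In this sketch the fresh nonce is what prevents a recorded response from being replayed on a later call; how the real system handles replay, key distribution, and tag length is not stated in the summary.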
Findings
Achieves a success rate above 99.2% on clean audio
Maintains high audio quality with PESQ > 4.2 and STOI > 0.94
Operates reliably under various channel distortions
Abstract
We present CallShield, the first caller identity authentication system that operates entirely at the audio layer, without relying on speech transcription, internet connectivity, or trusted infrastructure. CallShield introduces a real-time neural watermarking technique that enables per-bit embedding and recovery within 40-millisecond frames of live 8 kHz speech. This capability allows CallShield to transform the real-time audio channel into a noisy serial communication medium. To ensure reliable data transmission, CallShield implements a low-bitrate data link protocol that provides basic frame synchronization along with error detection, correction, and recovery. For caller authentication, CallShield adopts a secure and lightweight symmetric-key protocol that relies on pairwise shared secrets among trusted contacts. The system completes the full authentication process in an average of 63…
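The channel figures in the abstract imply a very low raw data rate, which is why a dedicated data link layer is needed. The sketch below works through that arithmetic and shows one simple way to frame bits with a sync pattern and a checksum. It assumes exactly one watermark bit per 40 ms frame; the preamble pattern, the CRC-8 polynomial, and the function names are hypothetical and not taken from the paper. It covers only synchronization and error detection, not the correction and recovery the abstract also mentions.

```python
SAMPLE_RATE_HZ = 8_000   # from the abstract: 8 kHz speech
FRAME_MS = 40            # from the abstract: 40-millisecond frames
BITS_PER_FRAME = 1       # assumption: one embedded bit per frame

# Hypothetical sync pattern marking the start of a data frame.
PREAMBLE = [1, 0, 1, 0, 1, 1, 0, 0]


def samples_per_frame(rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Audio samples available to carry each embedded bit."""
    return rate_hz * frame_ms // 1000


def raw_bitrate(frame_ms=FRAME_MS, bits_per_frame=BITS_PER_FRAME):
    """Raw channel rate in bits per second, before protocol overhead."""
    return bits_per_frame * 1000 / frame_ms


def airtime_seconds(n_bits):
    """Time needed to push n_bits through the raw channel."""
    return n_bits / raw_bitrate()


def crc8(bits, poly=0x07):
    """Bit-serial CRC-8 (init 0, MSB first); polynomial is an assumption."""
    reg = 0
    for b in bits:
        reg ^= b << 7
        reg = ((reg << 1) ^ poly) & 0xFF if reg & 0x80 else (reg << 1) & 0xFF
    return reg


def build_frame(payload):
    """Preamble, then payload bits, then 8 CRC bits (MSB first)."""
    c = crc8(payload)
    return PREAMBLE + list(payload) + [(c >> i) & 1 for i in range(7, -1, -1)]


def parse_frame(bits, payload_len):
    """Find the first preamble and return the payload if the CRC checks out."""
    n, total = len(PREAMBLE), len(PREAMBLE) + payload_len + 8
    for start in range(len(bits) - total + 1):
        if bits[start:start + n] == PREAMBLE:
            body = bits[start + n:start + total]
            # CRC over payload plus its checksum is zero when intact.
            return body[:payload_len] if crc8(body) == 0 else None
    return None


if __name__ == "__main__":
    print(samples_per_frame())   # 320 samples per frame
    print(raw_bitrate())         # 25.0 bits per second
    print(parse_frame(build_frame([1, 1, 0, 1]), 4))
```

Under the one-bit-per-frame assumption, the raw channel carries 25 bits per second, so every byte of protocol overhead costs roughly a third of a second of call time.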
Taxonomy
Topics: User Authentication and Security Systems · Advanced Steganography and Watermarking Techniques · Speech Recognition and Synthesis
