Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems
Hadi Abdullah, Aditya Karlekar, Saurabh Prasad, Muhammad Sajidur Rahman, Logan Blue, Luke A. Bauer, Vincent Bindschaedler, Patrick Traynor

TL;DR
This paper develops a new, more robust audio CAPTCHA by analyzing recent speech-to-text attacks, creating a challenge that is highly resistant to automatic transcription while remaining understandable to humans.
Contribution
It introduces a novel audio CAPTCHA mechanism that leverages attack insights to significantly improve resistance against automated systems, balancing security and accessibility.
Findings
The new CAPTCHA is four orders of magnitude more difficult to crack.
The proposed mechanism is highly intelligible to humans.
The CAPTCHA effectively evades current speech-to-text systems.
Abstract
Audio CAPTCHAs are supposed to provide a strong defense for online resources; however, advances in speech-to-text mechanisms have rendered these defenses ineffective. Audio CAPTCHAs cannot simply be abandoned, as they are specifically named by the W3C as important enablers of accessibility. Accordingly, demonstrably more robust audio CAPTCHAs are important to the future of a secure and accessible Web. We look to recent literature on attacks on speech-to-text systems for inspiration for the construction of robust, principle-driven audio defenses. We begin by comparing 20 recent attack papers, classifying and measuring their suitability to serve as the basis of new "robust to transcription" but "easy for humans to understand" CAPTCHAs. After showing that none of these attacks alone are sufficient, we propose a new mechanism that is both comparatively intelligible (evaluated through a user…
Taxonomy
Topics: User Authentication and Security Systems · Music and Audio Processing · Speech Recognition and Synthesis
