MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
Zonglin Wu, Yule Xue, Yaoyao Feng, Xiaolong Wang, Yiren Song

TL;DR
MCA-Bench is a unified, multimodal benchmarking suite that evaluates CAPTCHA robustness against VLM-based attacks, providing insights into vulnerabilities and guiding future CAPTCHA design improvements.
Contribution
The paper introduces the first comprehensive, reproducible benchmark integrating diverse CAPTCHA types with a shared vision-language model backbone for cross-modal security assessment.
Findings
MCA-Bench effectively maps CAPTCHA vulnerabilities across attack scenarios.
The study reveals how challenge complexity and interaction depth affect solvability.
The study provides actionable principles for designing more robust CAPTCHA systems.
Abstract
As automated attack techniques rapidly advance, CAPTCHAs remain a critical defense mechanism against malicious bots. Existing CAPTCHA schemes encompass a diverse range of modalities -- from static distorted text and obfuscated images to interactive clicks, sliding puzzles, and logic-based questions -- yet the community still lacks a unified, large-scale, multimodal benchmark to rigorously evaluate their security robustness. To address this gap, we introduce MCA-Bench, a comprehensive and reproducible benchmarking suite that integrates heterogeneous CAPTCHA types into a single evaluation protocol. Leveraging a shared vision-language model backbone, we fine-tune specialized cracking agents for each CAPTCHA category, enabling consistent, cross-modal assessments. Extensive experiments reveal that MCA-Bench effectively maps the vulnerability spectrum of modern CAPTCHA designs under…
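The idea of a single evaluation protocol with one specialized agent per CAPTCHA category can be illustrated with a minimal sketch. This is not MCA-Bench's actual code or API: the `Challenge` record, the `pass_rates` function, the category labels, and the toy solvers are all hypothetical names chosen for illustration, and simple string functions stand in for fine-tuned vision-language agents.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Challenge(NamedTuple):
    """Hypothetical record for one CAPTCHA instance (not the paper's schema)."""
    category: str  # illustrative label, e.g. "text" or "math"
    payload: str   # stand-in for the image or interaction data
    answer: str    # ground-truth solution


# A solver maps a challenge payload to a predicted answer.
Solver = Callable[[str], str]


def pass_rates(challenges: Iterable[Challenge],
               solvers: Dict[str, Solver]) -> Dict[str, float]:
    """Return the fraction of challenges solved in each category.

    Each challenge is routed to the solver registered for its category;
    a category with no registered solver scores 0.0.
    """
    solved = defaultdict(int)
    total = defaultdict(int)
    for ch in challenges:
        total[ch.category] += 1
        solver = solvers.get(ch.category)
        if solver is not None and solver(ch.payload) == ch.answer:
            solved[ch.category] += 1
    return {cat: solved[cat] / total[cat] for cat in total}


if __name__ == "__main__":
    # Toy data and toy solvers, purely for demonstration.
    demo = [
        Challenge("text", "abc", "abc"),
        Challenge("text", "xyz", "zyx"),
        Challenge("math", "2+3", "5"),
    ]
    toy_solvers = {
        "text": lambda p: p,
        "math": lambda p: str(sum(int(x) for x in p.split("+"))),
    }
    print(pass_rates(demo, toy_solvers))
```

Reporting one pass rate per category under the same scoring rule is what makes results comparable across modalities; a real harness would replace the exact-match check with category-specific criteria, for example a pixel tolerance for slider positions.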
Taxonomy
Topics: User Authentication and Security Systems · Advanced Malware Detection Techniques · Spam and Phishing Detection
