Exploring Vision-Language Models for Online Signature Verification: A Zero-Shot Capability Study
Marta Robledo-Moreno, Ruben Vera-Rodriguez, Ruben Tolosana, Javier Ortega-Garcia

TL;DR
This study evaluates the zero-shot performance of advanced Vision-Language Models on online signature verification, revealing strengths in random forgery detection and challenges with skilled forgeries and reasoning artifacts.
Contribution
It introduces a novel protocol for biometric scoring using VLMs and provides the first exploration of their zero-shot capabilities in signature verification tasks.
Findings
GPT-5.2 reaches a 0.32% Equal Error Rate (EER) on mobile tasks in the random forgery scenario.
VLMs outperform supervised systems in random forgery scenarios.
Performance drops significantly in skilled forgery scenarios, where reasoning artifacts affect the models' decisions.
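The findings above are reported as EER, the operating point at which the false acceptance rate (forgeries accepted) equals the false rejection rate (genuine signatures rejected). As a rough illustration of how such a figure is obtained from a list of scores, here is a minimal sketch. The function name `equal_error_rate` and the toy scores are assumptions made for illustration, not the paper's evaluation code, and the threshold sweep is a simple approximation rather than an interpolated EER.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping every observed score as a threshold.

    genuine:  scores for genuine comparisons (higher = more likely genuine)
    impostor: scores for forgery comparisons
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # A comparison is accepted when its score is >= the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)  # forgeries accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): one forgery scores higher than one genuine sample.
genuine_scores = [0.9, 0.8, 0.7, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.65]
print(f"EER = {100 * equal_error_rate(genuine_scores, impostor_scores):.2f}%")
```

With only a handful of scores the EER can take only coarse values; benchmark figures such as those quoted here come from far larger sets of comparisons.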
Abstract
Recent advancements in Vision-Language Models (VLMs) have demonstrated strong capabilities in general visual reasoning, yet their applicability to rigorous biometric tasks remains unexplored. This work presents an exploratory study evaluating the zero-shot performance of state-of-the-art VLMs (GPT-5.2 and Gemini 2.5 Pro) on the Signature Verification Challenge (SVC) benchmark. To enable visual processing, raw kinematic time-series are converted into static images, encoding pressure information into stroke opacity whenever available in the source data. Furthermore, we introduce a scoring protocol that extracts latent token probabilities to compute robust biometric scores. Experimental results reveal a significant performance dichotomy dependent on signal quality and forgery type. In random forgery scenarios, the zero-shot VLM achieves exceptional discrimination, with GPT-5.2 reaching an…
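The two preprocessing and scoring ideas mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact pressure-to-opacity mapping or the scoring formula, so the linear opacity scaling, the `min_alpha` floor, the answer tokens `"genuine"` and `"forgery"`, and both function names are hypothetical. The sketch assumes the model exposes log-probabilities for candidate answer tokens and normalises the two of interest into a score in [0, 1].

```python
import math


def stroke_segments(samples, min_alpha=0.2):
    """Turn (x, y, pressure) samples into line segments with an opacity value.

    Assumption: opacity scales linearly with pressure, normalised by the
    maximum pressure in the signature. If pressure is missing (None), the
    segment is drawn fully opaque.
    """
    pressures = [p for _, _, p in samples if p is not None]
    p_max = max(pressures) if pressures else 0.0
    segments = []
    for (x0, y0, p0), (x1, y1, p1) in zip(samples, samples[1:]):
        if p0 is None or p1 is None or p_max <= 0:
            alpha = 1.0  # no usable pressure channel
        else:
            mean_p = (p0 + p1) / 2
            alpha = min_alpha + (1.0 - min_alpha) * mean_p / p_max
        segments.append(((x0, y0), (x1, y1), alpha))
    return segments


def verification_score(top_logprobs, pos="genuine", neg="forgery"):
    """Map answer-token log-probabilities to a score in [0, 1].

    top_logprobs: dict of token -> log-probability at the answer position
    (hypothetical format). Tokens absent from the dict count as probability 0.
    """
    p_pos = math.exp(top_logprobs[pos]) if pos in top_logprobs else 0.0
    p_neg = math.exp(top_logprobs[neg]) if neg in top_logprobs else 0.0
    total = p_pos + p_neg
    return 0.5 if total == 0 else p_pos / total


# Example with made-up values.
pen_samples = [(0, 0, 0.5), (1, 1, 1.0), (2, 0, 1.0)]
for start, end, alpha in stroke_segments(pen_samples):
    print(start, end, round(alpha, 2))

print(verification_score({"genuine": math.log(0.75), "forgery": math.log(0.25)}))
```

The segments could be passed to any plotting library that accepts a per-line alpha value. A continuous score of this kind, rather than a hard yes/no answer, is what allows a threshold to be swept when computing metrics such as EER.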
