CryptoFormalEval: Integrating LLMs and Formal Verification for Automated Cryptographic Protocol Vulnerability Detection
Cristian Curaba, Denis D'Ambrosi, Alessandro Minisini, Natalia Pérez-Campanero Antolín

TL;DR
This paper introduces a benchmark to evaluate how well large language models can autonomously detect vulnerabilities in cryptographic protocols by interacting with a formal verification tool, aiming to enhance automated cybersecurity analysis.
Contribution
It presents a novel benchmark and dataset for assessing LLMs in cryptographic vulnerability detection, integrating AI with formal verification tools like Tamarin.
Findings
Current frontier models show promising performance in vulnerability detection.
The benchmark reveals strengths and limitations of LLMs in formal protocol analysis.
The results offer insights into combining LLMs with symbolic reasoning for cybersecurity applications.
Abstract
Cryptographic protocols play a fundamental role in securing modern digital infrastructure, but they are often deployed without prior formal verification. This could lead to the adoption of distributed systems vulnerable to attack vectors. Formal verification methods, on the other hand, require complex and time-consuming techniques that lack automation. In this paper, we introduce a benchmark to assess the ability of Large Language Models (LLMs) to autonomously identify vulnerabilities in new cryptographic protocols through interaction with Tamarin, a theorem prover for protocol verification. We created a manually validated dataset of novel, flawed communication protocols and designed a method to automatically verify the vulnerabilities found by the AI agents. Our results on the performance of current frontier models on the benchmark provide insights about the possibility…
Taxonomy
Topics: Advanced Authentication Protocols Security · Cryptographic Implementations and Security · Network Security and Intrusion Detection
