CryptoScope: Utilizing Large Language Models for Automated Cryptographic Logic Vulnerability Detection

Zhihao Li; Zimo Ji; Tao Zheng; Hao Ren; Xiao Lan

arXiv:2508.11599 · cs.CR · August 18, 2025


TL;DR

CryptoScope is a framework that combines Large Language Models with Chain-of-Thought prompting and retrieval-augmented generation to automatically detect cryptographic logic vulnerabilities. It improves on baseline LLMs by up to 28.69% and uncovers 9 previously unknown flaws in open-source projects.

Contribution

The paper introduces CryptoScope, which combines LLMs with Chain-of-Thought prompting, Retrieval-Augmented Generation, and a curated cryptographic knowledge base of over 12,000 entries to automate vulnerability detection in cryptographic code.

Findings

01. CryptoScope improves detection accuracy over baseline LLMs by up to 28.69% (GLM-4-Flash).

02. It discovers 9 previously unknown cryptographic flaws in open-source projects.

03. The gains are consistent on LLM-CLVA, a 92-case benchmark that spans 11 programming languages and draws on real-world CVE vulnerabilities, CTF cryptographic challenges, and synthetic examples.

Abstract

Cryptographic algorithms are fundamental to modern security, yet their implementations frequently harbor subtle logic flaws that are hard to detect. We introduce CryptoScope, a novel framework for automated cryptographic vulnerability detection powered by Large Language Models (LLMs). CryptoScope combines Chain-of-Thought (CoT) prompting with Retrieval-Augmented Generation (RAG), guided by a curated cryptographic knowledge base containing over 12,000 entries. We evaluate CryptoScope on LLM-CLVA, a benchmark of 92 cases primarily derived from real-world CVE vulnerabilities, complemented by cryptographic challenges from major Capture The Flag (CTF) competitions and synthetic examples across 11 programming languages. CryptoScope consistently improves performance over strong LLM baselines, boosting DeepSeek-V3 by 11.62%, GPT-4o-mini by 20.28%, and GLM-4-Flash by 28.69%. Additionally, it…
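
The abstract describes pairing retrieval from a cryptographic knowledge base with Chain-of-Thought prompting. The following is a minimal sketch of that general pattern, not the authors' implementation: the three knowledge-base entries, the word-overlap scoring (a simple stand-in for whatever retriever the paper uses), the prompt wording, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for illustration. No LLM is called; the sketch only assembles the prompt text.

```python
import re
from collections import Counter

# Hypothetical entries for illustration; the paper's knowledge base
# is reported to hold over 12,000 entries.
KNOWLEDGE_BASE = [
    "AES in ECB mode leaks plaintext patterns because equal blocks encrypt to equal ciphertext.",
    "Reusing a nonce with AES-GCM allows recovery of the authentication key.",
    "RSA with a small public exponent and no padding is vulnerable to cube-root attacks.",
]


def tokenize(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, entries, k=2):
    """Return up to k entries ranked by token overlap with the query.

    Word overlap is an assumed stand-in for a real retriever.
    Entries with zero overlap are dropped.
    """
    query_tokens = tokenize(query)
    scored = [(sum((query_tokens & tokenize(e)).values()), e) for e in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:k] if score > 0]


def build_prompt(code, entries):
    """Assemble a step-by-step (Chain-of-Thought style) prompt with retrieved context."""
    context = "\n".join(f"- {entry}" for entry in entries) or "- (no relevant entries found)"
    return (
        "You are auditing cryptographic code.\n"
        f"Relevant background:\n{context}\n\n"
        f"Code under review:\n{code}\n\n"
        "Reason step by step: identify the primitive and mode, check how keys, "
        "nonces and padding are handled, then state whether a logic flaw exists."
    )


snippet = "cipher = AES.new(key, AES.MODE_ECB)"
prompt = build_prompt(snippet, retrieve(snippet, KNOWLEDGE_BASE, k=1))
print(prompt)
```

In this sketch the retrieved entry is placed ahead of the code so that the model sees the relevant background before it begins reasoning. The resulting string would then be sent to whichever LLM is under evaluation.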
