Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges

Lajos Muzsai; David Imolai; András Lukács

arXiv:2506.02048·cs.CR·August 19, 2025

Open Access

TL;DR

This paper introduces 'Random-Crypto', a new cryptographic CTF dataset for training LLM agents with reinforcement learning, demonstrating improved reasoning, tool use, and generalization to external benchmarks in cybersecurity tasks.

Contribution

The paper presents Random-Crypto, a procedurally generated cryptographic CTF dataset, and uses it to fine-tune a Python tool-augmented Llama-3.1-8B agent with Group Relative Policy Optimization (GRPO). The fine-tuned agent improves Pass@8 on unseen challenges and generalizes to external security benchmarks.

Findings

01. Significant improvement in Pass@8 on unseen challenges

02. Enhanced tool usage and procedural reasoning contribute to gains

03. Generalization to external crypto and non-crypto benchmarks
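
For readers unfamiliar with the Pass@8 metric named above, the following is a minimal sketch of how Pass@k is commonly computed. It assumes the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), for n sampled attempts per challenge of which c succeed. This page does not state which estimator the authors used, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k attempts succeeds.

    n: attempts sampled for one challenge
    c: how many of those attempts recovered the correct flag
    k: attempt budget being scored (8 for Pass@8)
    Assumption: the standard unbiased estimator, not necessarily the paper's.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Too few failures to fill k draws, so every draw of k has a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average Pass@k over challenges; results holds (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

When exactly eight attempts are sampled per challenge (n = k = 8), the estimator reduces to a simple indicator: a challenge counts as solved if any one of the eight attempts succeeds.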

Abstract

We present 'Random-Crypto', a procedurally generated cryptographic Capture The Flag (CTF) dataset designed to unlock the potential of Reinforcement Learning (RL) for LLM-based agents in security-sensitive domains. Cryptographic reasoning offers an ideal RL testbed: it combines precise validation, structured multi-step inference, and reliance on reliable computational tool use. Leveraging these properties, we fine-tune a Python tool-augmented Llama-3.1-8B via Group Relative Policy Optimization (GRPO) in a secure execution environment. The resulting agent achieves a significant improvement in Pass@8 on previously unseen challenges. Moreover, the improvements generalize to two external benchmarks: 'picoCTF', spanning both crypto and non-crypto tasks, and 'AICrypto MCQ', a multiple-choice benchmark of 135 cryptography questions. Ablation studies attribute the gains to enhanced tool usage…
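
The abstract names GRPO as the training algorithm. As a rough illustration only, the sketch below shows the group-relative advantage step that gives GRPO its name: several attempts at the same challenge are scored, and each reward is normalized against the mean and standard deviation of its own group. The binary flag-correctness reward, the use of the population standard deviation, the `eps` constant, and the function name are all assumptions made for this sketch, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own group of sampled attempts.

    rewards: one scalar per attempt at the same challenge, for example
             1.0 if the submitted flag was correct and 0.0 otherwise
             (an assumed reward scheme).
    eps:     small constant that avoids division by zero when every
             attempt in the group receives the same reward.
    """
    if not rewards:
        raise ValueError("rewards must not be empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ here
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of four attempts in which only the first recovers the flag.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 0.0])
```

Under these assumptions the successful attempt receives a positive advantage and the failed ones a negative advantage. A group in which every attempt earns the same reward yields all-zero advantages and so contributes no learning signal.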

Taxonomy

Topics: Blockchain Technology Applications and Security · Cryptography and Data Security