Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges
Lajos Muzsai, David Imolai, András Lukács

TL;DR
This paper introduces 'Random-Crypto', a new cryptographic CTF dataset for training LLM agents with reinforcement learning, demonstrating improved reasoning, tool use, and generalization to external benchmarks in cybersecurity tasks.
Contribution
The paper presents a novel procedurally generated cryptographic dataset and fine-tunes a Llama-based agent with RL, achieving significant performance gains and better generalization in security-related reasoning tasks.
Findings
Significant improvement in Pass@8 on unseen challenges
Enhanced tool usage and procedural reasoning contribute to gains
Generalization to external crypto and non-crypto benchmarks
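For readers unfamiliar with the metric, Pass@k is the probability that at least one of k sampled attempts solves a challenge. The sketch below uses the commonly cited unbiased estimator, 1 - C(n-c, k) / C(n, k), for n attempts of which c succeed. This summary does not say how the authors compute Pass@8, so the function name and the choice of estimator are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n sampled attempts, c of which were correct.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failures: every size-k subset contains a success.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset is all failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # With n == k == 8, Pass@8 is 1.0 as soon as any attempt succeeds.
    print(pass_at_k(n=8, c=1, k=8))
    print(pass_at_k(n=8, c=0, k=8))
```

When exactly eight attempts are sampled per challenge, the estimator reduces to "did any of the eight attempts recover the flag", averaged over challenges.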
Abstract
We present 'Random-Crypto', a procedurally generated cryptographic Capture The Flag (CTF) dataset designed to unlock the potential of Reinforcement Learning (RL) for LLM-based agents in security-sensitive domains. Cryptographic reasoning offers an ideal RL testbed: it combines precise validation, structured multi-step inference, and reliance on reliable computational tool use. Leveraging these properties, we fine-tune a Python tool-augmented Llama-3.1-8B via Group Relative Policy Optimization (GRPO) in a secure execution environment. The resulting agent achieves a significant improvement in Pass@8 on previously unseen challenges. Moreover, the improvements generalize to two external benchmarks: 'picoCTF', spanning both crypto and non-crypto tasks, and 'AICrypto MCQ', a multiple-choice benchmark of 135 cryptography questions. Ablation studies attribute the gains to enhanced tool usage…
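The core idea of GRPO, as generally described, is to score each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. The sketch below shows only that group-relative advantage step. The binary reward (1.0 for a correct flag, 0.0 otherwise), the use of the population standard deviation, and the function name are assumptions for illustration, not details confirmed by the abstract.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward by its group's mean and standard deviation.

    Hypothetical helper: one group = all completions sampled for one challenge.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Assumed reward scheme: 1.0 if the submitted flag is correct, else 0.0.
    group = [1.0, 0.0, 0.0, 0.0]
    print(group_relative_advantages(group))
```

If every completion in a group receives the same reward, all advantages are zero and that group contributes no learning signal, which is one reason precisely verifiable tasks with mixed outcomes suit this kind of training.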
Taxonomy
Topics: Blockchain Technology Applications and Security · Cryptography and Data Security
