Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards
Philipp Normann, Andreas Happe, Jürgen Cito, Daniel Arp

TL;DR
This paper introduces a two-stage post-training method that enables small, local LLMs to perform Linux privilege escalation, reaching high success rates through training with verifiable rewards while significantly reducing inference costs.
Contribution
It presents a novel two-stage post-training pipeline combining supervised fine-tuning and reinforcement learning for security tasks on small models.
Findings
Supervised fine-tuning more than doubles the baseline success rate.
Reinforcement learning raises the success rate to 95.8%.
Inference cost per successful escalation is reduced by over 100x.
Abstract
LLM agents are increasingly relevant to research domains such as vulnerability discovery. Yet, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to reproduce, and unsuitable for work involving proprietary code or sensitive data. Consequently, there is an urgent need for small, local models that can perform security tasks under strict resource budgets, but methods for developing them remain underexplored. In this paper, we address this gap by proposing a two-stage post-training pipeline. We focus on the problem of Linux privilege escalation, where success is automatically verifiable and the task requires multi-step interactive reasoning. Using an experimental setup that prevents data leakage, we post-train a 4B model in two stages: supervised fine-tuning on traces from procedurally generated privilege-escalation environments, followed by…
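To make "automatically verifiable" concrete, here is a minimal sketch of a binary reward for a privilege-escalation episode. It is not the authors' implementation: the excerpt does not say how success is checked. The `EpisodeResult` structure, its field names, and the step-budget rule are all hypothetical. The only assumption about Linux is the standard one that `id -u` prints `0` for root.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    """Outcome of one agent episode in a sandbox (hypothetical structure)."""
    final_uid_output: str  # stdout of `id -u` run in the agent's final shell
    steps_used: int        # number of commands the agent issued
    max_steps: int         # step budget for the episode (assumed rule)


def verifiable_reward(result: EpisodeResult) -> float:
    """Return 1.0 if the agent ended as root within budget, else 0.0."""
    if result.steps_used > result.max_steps:
        return 0.0
    # On Linux, root has UID 0, so `id -u` prints "0" after a successful escalation.
    return 1.0 if result.final_uid_output.strip() == "0" else 0.0
```

Because the check depends only on observable system state, it needs no human labeling or learned judge, which is what makes such a task a natural fit for reinforcement learning with verifiable rewards.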
Taxonomy
Topics: Software Engineering Research · Web Application Security Vulnerabilities · Information and Cyber Security
