SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Yuxiang Wei; Olivier Duchenne; Jade Copet; Quentin Carbonneaux; Lingming Zhang; Daniel Fried; Gabriel Synnaeve; Rishabh Singh; Sida I. Wang

arXiv:2502.18449 · cs.SE · December 2, 2025 · 2 cites

Open Access

TL;DR

This paper introduces SWE-RL, a reinforcement learning approach that enhances large language models' reasoning by training on extensive open-source software evolution data, achieving state-of-the-art results on real-world software engineering tasks.

Contribution

SWE-RL is the first method to apply RL-based reasoning to real-world software engineering data, improving LLM performance on practical software tasks.

Findings

01. Achieves a 41.0% solve rate on SWE-bench Verified.

02. Outperforms a supervised fine-tuning baseline on out-of-domain tasks.

03. Shows emergent generalized reasoning skills after training on software evolution data.

Abstract

The recent DeepSeek-R1 release has demonstrated the immense potential of reinforcement learning (RL) in enhancing the general reasoning capabilities of large language models (LLMs). While DeepSeek-R1 and other follow-up work primarily focus on applying RL to competitive coding and math problems, this paper introduces SWE-RL, the first approach to scale RL-based LLM reasoning for real-world software engineering. Leveraging a lightweight rule-based reward (e.g., the similarity score between ground-truth and LLM-generated solutions), SWE-RL enables LLMs to autonomously recover a developer's reasoning processes and solutions by learning from extensive open-source software evolution data -- the record of a software's entire lifecycle, including its code snapshots, code changes, and events such as issues and pull requests. Trained on top of Llama 3, our resulting reasoning model,…
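The rule-based reward mentioned in the abstract can be illustrated with a minimal sketch. It assumes the similarity score is a sequence-matching ratio over patch text, computed here with Python's standard `difflib.SequenceMatcher`; the function name `similarity_reward` and the sample patches are hypothetical and chosen only for illustration, not taken from the paper.

```python
import difflib


def similarity_reward(predicted_patch: str, reference_patch: str) -> float:
    """Score a predicted patch against the ground-truth patch.

    Assumption for illustration: the reward is the difflib sequence-matching
    ratio, which lies in [0.0, 1.0] and equals 1.0 for identical strings.
    """
    matcher = difflib.SequenceMatcher(None, predicted_patch, reference_patch)
    return matcher.ratio()


# Hypothetical ground-truth patch and two candidate model outputs.
reference = "-    return a - b\n+    return a + b\n"
close_guess = "-    return a - b\n+    return b + a\n"
poor_guess = "+    print('hello')\n"

for name, candidate in [("close", close_guess), ("poor", poor_guess)]:
    print(name, round(similarity_reward(candidate, reference), 3))
```

A reward of this kind needs no test execution, which is what makes it lightweight: it only compares text, so a patch that is functionally correct but written differently from the reference would still score below 1.0.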

Taxonomy

Topics: Software Engineering Research · Multi-Agent Systems and Negotiation

Methods: Lib · Focus · LLaMA