SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Ibragim Badertdinov; Maksim Nekrashevich; Anton Shevtsov; Alexander Golubev

arXiv:2602.23866·cs.SE·March 2, 2026

TL;DR

SWE-rebench V2 introduces a large-scale, language-agnostic dataset of over 32,000 real-world software engineering tasks with reproducible environments, enabling improved training of SWE agents across diverse programming languages.

Contribution

The paper presents an automated pipeline for harvesting and validating SWE tasks across multiple languages, expanding the training data available for SWE agents.

Findings

01. Constructed a dataset of 32,000+ tasks across 20 languages.

02. Validated dataset quality through diagnostic studies with multiple models.

03. Released 120,000+ tasks with rich metadata for training and benchmarking.

Abstract

Software engineering (SWE) agents are improving rapidly, with recent gains largely driven by reinforcement learning (RL). However, RL training is constrained by the scarcity of large-scale task collections with reproducible execution environments and reliable test suites. Although a growing number of benchmarks have emerged, datasets suitable for training remain limited in scale and diversity, or target only a small set of high-resource language ecosystems. We introduce SWE-rebench V2, a language-agnostic automated pipeline for harvesting executable real-world SWE tasks and constructing RL training environments at scale. The pipeline synthesizes repository-specific installation and test procedures via an interactive setup agent, and filters unsound instances using an ensemble of LLM judges, validated against human-verified SWE-bench annotations. Using this pipeline, we construct a…
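
The filtering step mentioned in the abstract (an ensemble of judges that rejects unsound instances) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Task` fields, the three stub judges, and the majority-vote rule are all assumptions made for illustration, and real judges would be LLM calls rather than simple predicates.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; the field names are assumptions."""
    task_id: str
    issue_text: str
    test_names: Tuple[str, ...]


# A judge returns True when it considers the task sound.
Judge = Callable[[Task], bool]


def filter_tasks(
    tasks: Iterable[Task],
    judges: Iterable[Judge],
    min_votes: Optional[int] = None,
) -> List[Task]:
    """Keep tasks that at least `min_votes` judges accept.

    Defaults to a strict majority; the paper's actual aggregation
    rule is not stated in the abstract, so this is an assumption.
    """
    judge_list = list(judges)
    if not judge_list:
        raise ValueError("at least one judge is required")
    if min_votes is None:
        min_votes = len(judge_list) // 2 + 1
    kept = []
    for task in tasks:
        votes = sum(1 for judge in judge_list if judge(task))
        if votes >= min_votes:
            kept.append(task)
    return kept


# Stub judges standing in for LLM calls (illustrative only).
def has_description(task: Task) -> bool:
    return bool(task.issue_text.strip())


def has_tests(task: Task) -> bool:
    return len(task.test_names) > 0


def description_long_enough(task: Task) -> bool:
    return len(task.issue_text) >= 10


tasks = [
    Task("a", "Parser rejects valid UTF-8 header", ("test_header",)),
    Task("b", "", ()),
    Task("c", "Crash on empty input", ()),
]
judges = [has_description, has_tests, description_long_enough]

majority_ids = [t.task_id for t in filter_tasks(tasks, judges)]
unanimous_ids = [t.task_id for t in filter_tasks(tasks, judges, min_votes=3)]
```

With a strict majority, task `c` survives even though it has no tests, while requiring all three votes keeps only `a`. The threshold therefore trades dataset size against strictness, which is the kind of choice that validation against human-verified annotations would be used to calibrate.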

Taxonomy

Topics: Software Engineering Research · Software Engineering Techniques and Practices · Reinforcement Learning in Robotics