Immersion in the GitHub Universe: Scaling Coding Agents to Mastery
Jiale Zhao, Guoxin Chen, Fanzhe Meng, Minghao Li, Jie Chen, Hui Xu, Yongshuai Sun, Wayne Xin Zhao, Ruihua Song, Yuan Zhang, Peng Wang, Cheng Chen, Jirong Wen, Kai Jia

TL;DR
This paper introduces ScaleSWE, an automated system that constructs a large, diverse dataset of software engineering tasks from GitHub, enabling significant improvements in training coding agents for real-world software engineering.
Contribution
The paper presents ScaleSWE, a novel automated multi-agent workflow that generates the largest high-quality SWE dataset to date, facilitating advancements in LLM-based software engineering.
Findings
Created 100k verified SWE instances from 6 million pull requests
Fine-tuned Qwen3-30B-A3B-Instruct to reach a 64% resolve rate on SWE benchmarks
Surpassed existing datasets in diversity and task complexity
Abstract
Achieving mastery in real-world software engineering tasks is fundamentally bottlenecked by the scarcity of large-scale, high-quality training data. Scaling such data has been limited by the complexity of environment setup, unit test generation, and problem statement curation. In this paper, we propose ScaleSWE, an automated, sandboxed multi-agent workflow designed to construct high-quality SWE data at scale. The system coordinates three specialized agents for environment setup, test creation, and problem description synthesis to process 6 million pull requests across 5200 repositories, producing Scale SWE Data: 100k verified SWE instances, the largest such dataset to date. It substantially surpasses existing real-world datasets in repository diversity and reflects realistic task complexity. We further demonstrate the dataset's utility for training by distilling 71498 high-quality…
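The three-agent workflow described in the abstract can be pictured as a filter over pull requests: a candidate becomes a dataset instance only if environment setup, test creation, and problem description synthesis all succeed. The following is a minimal sketch of that idea, not the authors' implementation. Every name in it (`Candidate`, `build_instances`, the stub stages, and the rule that an empty diff yields no tests) is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Candidate:
    """One pull request moving through the pipeline (all fields hypothetical)."""
    repo: str
    pr_id: int
    diff: str
    environment: Optional[str] = None
    tests: List[str] = field(default_factory=list)
    problem_statement: Optional[str] = None


# A stage mutates the candidate and returns True on success.
Stage = Callable[[Candidate], bool]


def setup_environment(c: Candidate) -> bool:
    # Stub: a real agent would build a sandbox; here we only record a label.
    c.environment = f"sandbox/{c.repo}:{c.pr_id}"
    return True


def create_tests(c: Candidate) -> bool:
    # Stub assumption: a pull request with an empty diff yields no tests.
    if not c.diff:
        return False
    c.tests = [f"test_pr_{c.pr_id}"]
    return True


def write_problem_statement(c: Candidate) -> bool:
    # Stub: a real agent would synthesize a task description from the PR.
    c.problem_statement = f"Resolve the issue addressed by PR #{c.pr_id}."
    return True


def build_instances(prs: Iterable[Candidate], stages: List[Stage]) -> List[Candidate]:
    """Keep only candidates for which every stage succeeds, in order."""
    kept = []
    for pr in prs:
        # all() over a generator stops at the first failing stage.
        if all(stage(pr) for stage in stages):
            kept.append(pr)
    return kept


stages = [setup_environment, create_tests, write_problem_statement]
prs = [
    Candidate(repo="example/repo", pr_id=1, diff="- old\n+ new"),
    Candidate(repo="example/repo", pr_id=2, diff=""),
]
instances = build_instances(prs, stages)
print(len(instances), "of", len(prs), "candidates kept")
```

A filter of this shape also makes the reported scale easy to read: 100k instances out of 6 million pull requests means that roughly 1.7% of processed pull requests end up as dataset instances.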
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Scientific Computing and Data Management
