Hybrid-Gym: Training Coding Agents to Generalize Across Tasks

Yiqing Xie; Emmy Liu; Gaokai Zhang; Nachiket Kotalwar; Shubham Gandhi; Sathwik Acharya; Xingyao Wang; Carolyn Rose; Graham Neubig; Daniel Fried

arXiv:2602.16819 · cs.SE · February 20, 2026

Open Access

TL;DR

Hybrid-Gym introduces a scalable training environment with synthetic tasks that teach transferable coding skills, enabling models to generalize better across diverse real-world programming challenges.

Contribution

The paper proposes Hybrid-Gym, a training environment of scalable synthetic tasks, such as function localization and dependency search, designed to teach coding agents skills that transfer across diverse programming tasks.

Findings

01 · 25.4% absolute gain on SWE-Bench Verified

02 · 7.9% improvement on SWT-Bench Verified

03 · 5.1% increase on Commit-0 Lite

Abstract

When assessing the quality of coding agents, predominant benchmarks, such as SWE-Bench, focus on solving single issues on GitHub. In contrast, in real use, these agents solve more varied and complex tasks that involve other skills such as exploring codebases, testing software, and designing architecture. In this paper, we first characterize some transferable skills that are shared across diverse tasks by decomposing trajectories into fine-grained components, and derive a set of principles for designing auxiliary training tasks to teach language models these skills. Guided by these principles, we propose a training environment, Hybrid-Gym, consisting of a set of scalable synthetic tasks, such as function localization and dependency search. Experiments show that agents trained on our synthetic tasks effectively generalize to diverse real-world tasks that are not present in training,…

Taxonomy

Topics: Software Engineering Research · Machine Learning and Algorithms · Machine Learning and Data Classification