Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL

Zhaofeng Wu; Shiqi Wang; Boya Peng; Anuj Goyal; Melanie Kambadur; Sebastian Ruder; Yoon Kim; Chloe Bi

arXiv:2604.20835 · cs.CL · April 24, 2026

TL;DR

This paper introduces Parallel-SFT, a strategy that uses parallel programs across multiple languages to enhance zero-shot transfer in code RL, leading to better generalization across programming languages.

Contribution

Proposes Parallel-SFT, a novel SFT approach utilizing parallel programs to improve cross-language transferability in code RL models.

Findings

1. Parallel-SFT improves transferability to unseen programming languages.

2. RL training that starts from a Parallel-SFT model generalizes better to unseen programming languages.

3. Latent space analysis shows more functionality-centric representations with Parallel-SFT.

Abstract

Modern language models demonstrate impressive coding capabilities in common programming languages (PLs), such as C++ and Python, but their performance in lower-resource PLs is often limited by training data availability. In principle, however, most programming skills are universal across PLs, so the capability acquired in one PL should transfer to others. In this work, we propose the task of zero-shot cross-programming-language transfer for code RL. We find that, for Llama-3.1, RL training for code generation in a source PL fails to improve, and sometimes even degrades, the performance on other target PLs. To address this, we hypothesize that effective RL transfer requires a generalizable SFT initialization before RL. We thus propose **Parallel-SFT**, an SFT strategy that incorporates "parallel programs" -- functionally equivalent code implemented in multiple PLs -- into the data…
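To make the idea of "parallel programs" concrete, here is a minimal sketch of how functionally equivalent implementations of one task might be grouped into SFT records. The record layout, the `SFTExample` and `build_parallel_examples` names, the `group_id` field, and the prompt template are all hypothetical illustrations; the abstract does not specify the paper's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTExample:
    # Hypothetical record layout, not the paper's actual data format.
    group_id: str    # ties together functionally equivalent programs
    language: str
    prompt: str
    completion: str


def build_parallel_examples(task: str, programs: dict[str, str]) -> list[SFTExample]:
    """Emit one SFT example per language for a single task.

    `programs` maps a language name to an implementation of `task`.
    All examples share the same group_id because they are assumed
    to be functionally equivalent.
    """
    return [
        SFTExample(
            group_id=task,
            language=lang,
            prompt=f"Write a {lang} program: {task}",  # assumed prompt template
            completion=code,
        )
        for lang, code in sorted(programs.items())
    ]


# Two toy implementations of the same functionality.
programs = {
    "Python": "def add(a, b):\n    return a + b\n",
    "C++": "int add(int a, int b) { return a + b; }\n",
}
examples = build_parallel_examples("add two integers", programs)
```

Keeping a shared `group_id` on every record is one way to let a data loader mix or co-batch the implementations of the same task; whether the paper does anything like this is not stated in the text above.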
