Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL
Zhaofeng Wu, Shiqi Wang, Boya Peng, Anuj Goyal, Melanie Kambadur, Sebastian Ruder, Yoon Kim, Chloe Bi

TL;DR
This paper introduces Parallel-SFT, a strategy that uses parallel programs across multiple languages to enhance zero-shot transfer in code RL, leading to better generalization across programming languages.
Contribution
Proposes Parallel-SFT, an SFT strategy that adds parallel programs (functionally equivalent code implemented in multiple PLs) to the SFT data to improve cross-language transfer in code RL.
Findings
Parallel-SFT improves transferability to unseen programming languages.
Running RL on top of a Parallel-SFT-initialized model improves generalization to unseen programming languages.
Latent space analysis shows more functionality-centric representations with Parallel-SFT.
Abstract
Modern language models demonstrate impressive coding capabilities in common programming languages (PLs), such as C++ and Python, but their performance in lower-resource PLs is often limited by training data availability. In principle, however, most programming skills are universal across PLs, so the capability acquired in one PL should transfer to others. In this work, we propose the task of zero-shot cross-programming-language transfer for code RL. We find that, for Llama-3.1, RL training for code generation in a source PL fails to improve, and sometimes even degrades, the performance on other target PLs. To address this, we hypothesize that effective RL transfer requires a generalizable SFT initialization before RL. We thus propose **Parallel-SFT**, an SFT strategy that incorporates "parallel programs" -- functionally equivalent code implemented in multiple PLs -- into the data…
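To make the notion of "parallel programs" concrete, here is a minimal sketch of how one task with functionally equivalent solutions in several PLs might be flattened into SFT records. The `ParallelProgram` schema, the `to_sft_records` helper, the prompt template, and the example task are all hypothetical illustrations; the abstract does not specify the paper's actual data format or mixing procedure.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ParallelProgram:
    """One task with functionally equivalent solutions in several PLs.

    Hypothetical schema for illustration only, not the paper's format.
    """
    task: str
    solutions: Dict[str, str]  # language name -> source code


def to_sft_records(programs: List[ParallelProgram]) -> List[dict]:
    """Emit one prompt/completion record per (task, language) pair."""
    records = []
    for prog in programs:
        # Sort by language name so the output order is deterministic.
        for lang, code in sorted(prog.solutions.items()):
            records.append({
                "prompt": f"Solve the following task in {lang}:\n{prog.task}",
                "completion": code,
                "language": lang,
            })
    return records


# Assumed toy example: the same functionality written in two languages.
example = ParallelProgram(
    task="Return the sum of a list of integers.",
    solutions={
        "Python": "def total(xs):\n    return sum(xs)\n",
        "C++": (
            "int total(const std::vector<int>& xs) {\n"
            "    return std::accumulate(xs.begin(), xs.end(), 0);\n"
            "}\n"
        ),
    },
)

records = to_sft_records([example])
```

Each task appears once per language with the same task description, so the model sees the same functionality expressed in different surface syntax.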
