CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora

Shangyu Li; Juyong Jiang; Meibo Ren; Sizhe Zhong; Huiri Tan; Yunhao Gou; Xu Han; Chun Yong Chong; Yun Peng; Jiasi Shen

arXiv:2604.18027·cs.SE·April 21, 2026

TL;DR

CodePivot is a reinforcement learning framework that uses Python as an intermediate representation to enable multilingual code transpilation without parallel data. It outperforms larger models on Python-to-Other tasks.

Contribution

The paper presents an RL-based training method that uses Python as an intermediate representation (IR), together with a new reward mechanism, to improve multilingual code transpilation, especially for low-resource languages.

Findings

01 Outperforms larger models on Python-to-Other tasks.

02 Improves low-resource language transpilation performance.

03 Does not require parallel corpora for training.

Abstract

Transpilation, or code translation, aims to convert source code from one programming language (PL) to another. It is beneficial for many downstream applications, from modernizing large legacy codebases to augmenting data for low-resource PLs. Recent large language model (LLM)-based approaches have demonstrated immense potential for code translation. Among these approaches, training-based methods are particularly important because LLMs currently do not effectively adapt to domain-specific settings that suffer from a lack of knowledge without targeted training. This limitation is evident in transpilation tasks involving low-resource PLs. However, existing training-based approaches rely on a pairwise transpilation paradigm, making it impractical to support a diverse range of PLs. This limitation is particularly prominent for low-resource PLs due to a scarcity of training data. Furthermore,…
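To illustrate why a pairwise paradigm scales poorly as more PLs are added, the sketch below counts directed translation routes with and without a pivot language. It is a minimal illustration of the counting argument only, not the paper's training procedure. The function names and the example languages (Java, Rust, Go) are hypothetical, and composing two hops through Python at inference time is an assumption made for the example.

```python
def pairwise_directions(n_languages: int) -> int:
    """Directed routes when every language translates directly to every other."""
    return n_languages * (n_languages - 1)


def pivot_directions(n_languages: int) -> int:
    """Directed routes when all translation passes through one pivot language.

    Each of the other n - 1 languages needs one route into the pivot
    and one route out of it.
    """
    return 2 * (n_languages - 1)


def route(source: str, target: str, pivot: str = "Python") -> list[str]:
    """Return the hops for a translation, going through the pivot when needed."""
    if source == target:
        return [source]
    if pivot in (source, target):
        # One endpoint is already the pivot, so a single hop is enough.
        return [source, target]
    return [source, pivot, target]


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, pairwise_directions(n), pivot_directions(n))
    print(route("Java", "Rust"))
    print(route("Python", "Go"))
```

With 10 languages, the pairwise count is 90 directed routes while the pivot count is 18, which is the gap that motivates routing through a single intermediate language.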

Code & Models

Repositories

lishangyu-hkust/CodePivot (GitHub)