Exploring and Unleashing the Power of Large Language Models in Automated Code Translation
Zhen Yang, Fang Liu, Zhongxing Yu, Jacky Wai Keung, Jia Li, Shuo Liu, Yifan Hong, Xiaoxue Ma, Zhi Jin, and Ge Li

TL;DR
This paper explores the use of large language models for automated code translation, identifying their strengths and limitations, and introduces UniTrans, a framework that enhances translation accuracy through test case generation and iterative repair.
Contribution
The paper proposes UniTrans, a novel framework that leverages LLMs and test case augmentation to improve code translation accuracy across multiple languages.
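The general idea of test-guided translation with iterative repair can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`translate_and_repair`, `run_tests`, `fake_translate`, `fake_repair`), the round limit, and the use of Python source strings as the "target language" are all assumptions made for illustration, and the two `fake_*` functions stand in for LLM calls.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[tuple, object]  # (input arguments, expected output)


def load_function(code: str, name: str) -> Callable:
    """Execute candidate source text and return the function called `name`."""
    namespace: dict = {}
    exec(code, namespace)
    return namespace[name]


def run_tests(code: str, name: str, tests: List[TestCase]) -> List[str]:
    """Return failure messages; an empty list means every test passed."""
    try:
        fn = load_function(code, name)
    except Exception as exc:  # syntax errors, missing function, etc.
        return [f"load error: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            actual = fn(*args)
        except Exception as exc:
            failures.append(f"{name}{args}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{name}{args}: expected {expected!r}, got {actual!r}")
    return failures


def translate_and_repair(source, name, tests, translate, repair, max_rounds=3):
    """Translate once, then repair until the tests pass or rounds run out."""
    candidate = translate(source, tests)  # tests are passed along as extra context
    for _ in range(max_rounds):
        failures = run_tests(candidate, name, tests)
        if not failures:
            return candidate, True
        candidate = repair(candidate, failures)  # feed failures back to the model
    return candidate, not run_tests(candidate, name, tests)


# Hypothetical stand-ins for LLM calls: the first translation has a bug,
# and the "repair" step fixes it.
def fake_translate(source, tests):
    return "def add(a, b):\n    return a - b\n"


def fake_repair(candidate, failures):
    return candidate.replace("a - b", "a + b")


tests = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]
final_code, passed = translate_and_repair(
    "int add(int a, int b) { return a + b; }", "add", tests,
    fake_translate, fake_repair,
)
```

In this toy run the first candidate fails two of the three tests, the failure messages are handed to the repair step, and the second candidate passes. A real pipeline would replace the stand-ins with model calls and compile and execute the target-language program in a sandbox rather than using `exec`.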
Findings
LLMs outperform traditional transpilers but face comprehension issues.
Test case-based augmentation significantly improves translation correctness.
UniTrans achieves substantial accuracy improvements across diverse datasets.
Abstract
Code translation tools (transpilers) are developed for automatic source-to-source translation. Although learning-based transpilers have shown impressive improvements over rule-based counterparts, owing to their task-specific pre-training on extensive monolingual corpora, their current performance remains unsatisfactory for practical deployment, and the associated training resources are prohibitively expensive. LLMs pre-trained on huge amounts of human-written code/text have shown remarkable performance in many code intelligence tasks due to their powerful generality, even without task-specific training. Thus, LLMs can potentially circumvent the above limitations, but they have not been exhaustively explored yet. This paper investigates diverse LLMs and learning-based transpilers for automated code translation tasks, finding that: although certain LLMs have outperformed…
Taxonomy
Topics: Natural Language Processing Techniques
