NL in the Middle: Code Translation with LLMs and Intermediate Representations
Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, Alexander Wong

TL;DR
This paper explores how intermediate representations, namely natural language (NL) summaries and abstract syntax trees (ASTs), can improve LLM-based code translation. It shows that chain-of-thought prompting with an intermediate NL summary raises the rate of successful translations by up to 13.8%.
Contribution
The paper demonstrates that integrating intermediate representations into the prompt, especially an NL summary combined with chain-of-thought prompting, increases the number of successful code translations produced by LLMs.
Findings
Chain-of-thought prompting with an intermediate NL summary increases successful translations by 13.8% on CodeNet and 6.7% on AVATAR for the best-performing model.
Intermediate representations improve LLM performance on the CodeNet and AVATAR code translation benchmarks.
NL summaries guide translation more effectively than ASTs, the other representation studied.
Abstract
Studies show that large language models (LLMs) produce buggy code translations. One promising avenue to improve translation accuracy is through intermediate representations, which provide structured guidance for the translation process. We investigate whether LLM-based code translation can benefit from intermediate representations, specifically in the form of natural language (NL) summaries and abstract syntax trees (ASTs). Since prompt engineering greatly affects LLM performance, we consider several ways to integrate these representations, from one-shot to chain-of-thought (CoT) prompting. Using Open GPT4 8X7B and specialized StarCoder and CodeGen models on popular code translation benchmarks (CodeNet and AVATAR), we find that CoT with an intermediate NL summary performs best, with an increase of 13.8% and 6.7%, respectively, in successful translations for the best-performing model…
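To make the idea of an intermediate representation concrete, the following is a minimal sketch of how a chain-of-thought prompt with an NL summary step, optionally accompanied by an AST, might be assembled. It is not the prompt used in the paper: the function names `build_cot_prompt` and `ast_representation`, the `use_ast` flag, and the prompt wording are all hypothetical, and the AST step assumes the source program is Python so that the standard `ast` module can parse it.

```python
import ast


def ast_representation(python_source: str) -> str:
    """Dump the AST of a Python snippet as text (only valid for Python input)."""
    return ast.dump(ast.parse(python_source))


def build_cot_prompt(source_code: str, source_lang: str, target_lang: str,
                     use_ast: bool = False) -> str:
    """Assemble one prompt that asks for an intermediate step before translating.

    The wording is illustrative only, not the prompt from the paper.
    """
    parts = [
        f"Here is a {source_lang} program:",
        source_code.strip(),
    ]
    if use_ast:
        # Supply the AST as extra context; assumes the source is Python.
        parts += ["Its abstract syntax tree is:", ast_representation(source_code)]
    parts += [
        "Step 1: Summarize in plain English what the program does.",
        f"Step 2: Using that summary, translate the program to {target_lang}.",
        f"Return only the {target_lang} code after the summary.",
    ]
    return "\n\n".join(parts)


example = "def add(a, b):\n    return a + b\n"
prompt = build_cot_prompt(example, "Python", "Java")
```

The resulting `prompt` string would then be sent to whichever model is being evaluated. Asking for the summary and the translation in a single prompt is one design choice; splitting them into two separate model calls is another.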
