SpecTra: Enhancing the Code Translation Ability of Language Models by Generating Multi-Modal Specifications
Vikram Nitin, Rahul Krishna, Baishakhi Ray

TL;DR
SpecTra improves LLM code translation by generating multi-modal specifications (static specifications, test cases, and natural language descriptions) from the source program and supplying them alongside the source code, yielding up to a 46% relative improvement across three language pairs.
Contribution
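Introduces SpecTra, a multi-stage method that generates specifications from a program, filters them with a self-consistency check, and uses the surviving specifications together with the source code to improve the accuracy of LLM-generated translations.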
Findings
Up to 46% relative improvement in translation performance across six popular LLMs.
Effective on three translation tasks: C to Rust, C to Go, and JavaScript to TypeScript.
Demonstrated scalability to full project translation.
Abstract
Large language models (LLMs) are increasingly being used for the task of automated code translation, which has important real-world applications. However, most existing approaches use only the source code of a program as an input to an LLM, and do not consider the different kinds of specifications that can be extracted from a program. In this paper, we propose SpecTra, a multi-stage approach that uses a novel self-consistency filter to first generate high-quality static specifications, test cases, and natural language descriptions from a given program, and then uses these along with the source code to improve the quality of LLM-generated translations. We evaluate SpecTra on three code translation tasks (C to Rust, C to Go, and JavaScript to TypeScript) and show that it can enhance the performance of six popular LLMs on these tasks by up to 46% in relative terms. We also…
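As a rough illustration of the self-consistency idea for the test-case modality, the sketch below keeps an LLM-proposed test only if the source program reproduces its expected output, then folds the surviving tests into a translation prompt. The abstract does not spell out how the filter works, so this check is an assumption rather than the authors' method. The names `filter_consistent_tests`, `build_translation_prompt`, and `run_source` are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a candidate test is (input, expected_output),
# both as strings, as an LLM might propose them for the source program.
TestCase = Tuple[str, str]


def filter_consistent_tests(
    run_source: Callable[[str], str],
    candidates: List[TestCase],
) -> List[TestCase]:
    """Keep candidates whose expected output matches the source program.

    Assumption: `run_source` executes the original program on one input and
    returns its output as a string. This is one plausible consistency check,
    not necessarily the filter used in the paper.
    """
    kept: List[TestCase] = []
    for test_input, expected in candidates:
        try:
            actual = run_source(test_input)
        except Exception:
            # A candidate that makes the source program fail is discarded.
            continue
        if actual.strip() == expected.strip():
            kept.append((test_input, expected))
    return kept


def build_translation_prompt(
    source_code: str,
    tests: List[TestCase],
    source_lang: str,
    target_lang: str,
) -> str:
    """Combine the source code and the filtered tests into a single prompt."""
    lines = [
        f"Translate the following {source_lang} program to {target_lang}.",
        "",
        source_code,
        "",
        "The translation must satisfy these input/output examples:",
    ]
    for test_input, expected in tests:
        lines.append(f"- input: {test_input!r} -> output: {expected!r}")
    return "\n".join(lines)
```

In practice `run_source` would wrap a compiled binary or an interpreter call. Candidates that disagree with the program's real behavior never reach the prompt, which is the point of filtering before translation.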
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Model-Driven Software Engineering Techniques
