SpecTra: Enhancing the Code Translation Ability of Language Models by Generating Multi-Modal Specifications

Vikram Nitin; Rahul Krishna; Baishakhi Ray

arXiv:2405.18574 · cs.SE · December 8, 2025 · 2 cites

TL;DR

SpecTra enhances code translation by generating and utilizing multi-modal specifications like test cases and descriptions, significantly improving LLM performance across multiple programming language pairs.

Contribution

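Introduces SpecTra, a multi-stage method that applies a self-consistency filter to generated specifications (static specifications, test cases, and natural language descriptions) and supplies them alongside the source code to improve the accuracy of LLM code translation.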

Findings

01  Up to 46% relative improvement in translation performance across six popular LLMs.

02  Effective on C to Rust, C to Go, and JavaScript to TypeScript translation.

03  Demonstrated scalability to full project translation.

Abstract

Large language models (LLMs) are increasingly being used for the task of automated code translation, which has important real-world applications. However, most existing approaches use only the source code of a program as an input to an LLM, and do not consider the different kinds of specifications that can be extracted from a program. In this paper, we propose SpecTra, a multi-stage approach that uses a novel self-consistency filter to first generate high-quality static specifications, test cases, and natural language descriptions from a given program, and then uses these along with the source code to improve the quality of LLM-generated translations. We evaluate SpecTra on three code translation tasks - C to Rust, C to Go, and JavaScript to TypeScript - and show that it can enhance the performance of six popular LLMs on these tasks by up to a relative improvement of 46%. We also…
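
The filtering idea in the abstract can be illustrated for one kind of specification, test cases. The sketch below is not the authors' implementation. It assumes that a candidate test case is an (input, expected output) pair proposed by an LLM, and that a candidate counts as self-consistent when running the original source program on the input reproduces the expected output. The names `filter_self_consistent`, `build_prompt`, and `run_source` are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a candidate test case is (input, expected_output).
TestCase = Tuple[str, str]


def filter_self_consistent(
    run_source: Callable[[str], str],
    candidates: List[TestCase],
) -> List[TestCase]:
    """Keep only candidates whose expected output matches the source program.

    `run_source` stands in for executing the original program on an input;
    in practice this might compile and run a C binary (an assumption here).
    """
    kept: List[TestCase] = []
    for test_input, expected in candidates:
        try:
            actual = run_source(test_input)
        except Exception:
            # The source program failed on this input, so discard the candidate.
            continue
        if actual.strip() == expected.strip():
            kept.append((test_input, expected))
    return kept


def build_prompt(source_code: str, tests: List[TestCase], target_language: str) -> str:
    """Combine the source code with the filtered test cases into one prompt."""
    lines = [f"Translate the following program to {target_language}.", "", source_code, ""]
    if tests:
        lines.append("The translation must satisfy these input/output examples:")
        for test_input, expected in tests:
            lines.append(f"  input: {test_input!r} -> output: {expected!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in for the source program: doubles an integer read as text.
    def doubler(text: str) -> str:
        return str(int(text) * 2)

    proposed = [("2", "4"), ("3", "7"), ("x", "0")]  # the last two are inconsistent
    good = filter_self_consistent(doubler, proposed)
    print(build_prompt("int main(void) { /* ... */ }", good, "Rust"))
```

In this sketch only the candidate that agrees with the source program's behavior reaches the translation prompt, so the translating model is never shown a wrong input/output example.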

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Model-Driven Software Engineering Techniques