Integrating Rules and Semantics for LLM-Based C-to-Rust Translation

Feng Luo; Kexing Ji; Cuiyun Gao; Shuzheng Gao; Jia Feng; Kui Liu; Xin Xia; Michael R. Lyu

arXiv:2508.06926·cs.SE·August 12, 2025

TL;DR

This paper introduces IRENE, a novel LLM-based framework that combines rules and semantics to improve the accuracy and safety of translating legacy C code into Rust, addressing limitations of previous methods.

Contribution

IRENE integrates rule-based retrieval, structured summarization, and error-driven refinement to enhance C-to-Rust translation with LLMs, improving semantic consistency and rule adherence.
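
The three stages named above can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not IRENE's actual implementation: the rule base, the substring-matching retrieval, the toy summarizer, the prompt layout, and the `llm` and `check` callables are all hypothetical stand-ins introduced for illustration. In practice `llm` would wrap a model call and `check` would wrap a Rust compiler run.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    pattern: str  # hypothetical trigger: a substring of the C source
    advice: str   # guidance text added to the prompt when the rule fires


# Illustrative rule base; these entries are assumptions, not from the paper.
RULES = [
    Rule("malloc", "Prefer Box or Vec over manual heap allocation."),
    Rule("char *", "Prefer String or &str over raw C strings."),
]


def retrieve_rules(c_code: str, rules: List[Rule]) -> List[str]:
    """Rule-based retrieval: keep the advice of every rule whose pattern occurs."""
    return [r.advice for r in rules if r.pattern in c_code]


def summarize(c_code: str) -> str:
    """Toy stand-in for structured summarization (a real system might ask an LLM)."""
    lines = [ln.strip() for ln in c_code.strip().splitlines() if ln.strip()]
    first = lines[0] if lines else ""
    return f"signature: {first}; lines: {len(lines)}"


def build_prompt(c_code: str, advice: List[str], summary: str,
                 error: Optional[str] = None) -> str:
    """Assemble one prompt from the source, retrieved rules, summary, and last error."""
    parts = ["Translate this C code to safe Rust.", "C code:", c_code,
             "Rules:", *advice, "Summary:", summary]
    if error is not None:
        parts += ["Previous attempt failed with:", error]
    return "\n".join(parts)


def translate(c_code: str,
              llm: Callable[[str], str],
              check: Callable[[str], Optional[str]],
              max_rounds: int = 3) -> str:
    """Error-driven refinement: re-prompt with the checker's message until it passes."""
    advice = retrieve_rules(c_code, RULES)
    summary = summarize(c_code)
    error: Optional[str] = None
    rust = ""
    for _ in range(max_rounds):
        rust = llm(build_prompt(c_code, advice, summary, error))
        error = check(rust)  # None means the candidate was accepted
        if error is None:
            break
    return rust
```

Passing `llm` and `check` in as callables keeps the loop independent of any particular model or toolchain, so the sketch can be exercised with simple stubs before real components are wired in.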

Findings

01 Improved translation accuracy on benchmark datasets

02 Enhanced safety by reducing unsafe code blocks

03 Effective handling of Rust rules and semantics

Abstract

Automated translation of legacy C code into Rust aims to ensure memory safety while reducing the burden of manual migration. Early approaches to code translation rely on static rule-based methods, but they suffer from limited coverage because they depend on predefined rule patterns. Recent works treat the task as a sequence-to-sequence problem and leverage large language models (LLMs). Although these LLM-based methods can reduce unsafe code blocks, the translated code often fails to follow Rust rules or to maintain semantic consistency. On the one hand, existing methods adopt a direct prompting strategy to translate the C code, which struggles to accommodate the differences in syntactic rules between C and Rust. On the other hand, this strategy makes it difficult for LLMs to accurately capture the semantics of complex code. To address these challenges, we propose IRENE, an…

Taxonomy

Topics: Natural Language Processing Techniques