To Tag, or Not to Tag: Translating C's Unions to Rust's Tagged Unions
Jaemin Hong, Sukyoung Ryu

TL;DR
This paper presents a static analysis technique to automatically replace C unions with Rust's tagged unions during translation, improving safety and correctness in migrated code.
Contribution
It introduces a static analysis that identifies the tag fields associated with C unions and uses them to replace those unions with Rust's tagged unions, addressing an unsafe feature that prior C-to-Rust translation work left untouched.
Findings
Identified 74 tag fields with no false positives
Achieved correct transformations in 17 out of 23 programs
Analyzed 141k LOC in under 5,000 seconds
Abstract
Automatic C-to-Rust translation is a promising way to enhance the reliability of legacy system software. However, C2Rust, an industrially developed translator, generates Rust code with unsafe features, undermining the translation's objective. While researchers have proposed techniques to remove unsafe features in C2Rust-generated code, these efforts have targeted only a limited subset of unsafe features. One important unsafe feature remaining unaddressed is a union, a type consisting of multiple fields sharing the same memory storage. Programmers often place a union with a tag in a struct to record the last-written field, but they can still access wrong fields. In contrast, Rust's tagged unions combine tags and unions at the language level, ensuring correct value access. In this work, we propose techniques to replace unions with tagged unions during C-to-Rust translation. We develop a…
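The contrast the abstract draws between a C-style union paired with a tag and a Rust tagged union can be sketched as follows. This is a minimal illustration, not the paper's transformation or C2Rust's actual output: every name here (`RawValue`, `RawTagged`, `Value`, `raw_as_int`, `as_int`, `to_tagged`) and the tag convention (0 for the integer field, 1 for the float field) are assumptions made up for the example.

```rust
// Hypothetical example; names and the tag convention are illustrative only.

// A C-style union: both fields share the same storage.
#[repr(C)]
#[derive(Clone, Copy)]
pub union RawValue {
    pub int_val: i32,
    pub float_val: f32,
}

// The common C idiom: a struct holding a tag next to the union.
// Assumed convention: tag == 0 means int_val was written last, tag == 1 means float_val.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawTagged {
    pub tag: u8,
    pub value: RawValue,
}

// Reading a union field requires `unsafe`; nothing stops a caller from
// skipping the tag check and reading the wrong field.
pub fn raw_as_int(v: &RawTagged) -> Option<i32> {
    if v.tag == 0 {
        // SAFETY: relies on the convention that tag == 0 implies int_val is valid.
        Some(unsafe { v.value.int_val })
    } else {
        None
    }
}

// A Rust tagged union (enum): the tag and the payload are one language-level type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Int(i32),
    Float(f32),
}

// Access goes through `match`, so the payload is only reachable in the arm
// of the variant that actually holds it. No `unsafe` is needed.
pub fn as_int(v: &Value) -> Option<i32> {
    match v {
        Value::Int(n) => Some(*n),
        Value::Float(_) => None,
    }
}

// Converting the C-style representation into the tagged union.
// Unknown tags yield None instead of an arbitrary read.
pub fn to_tagged(v: &RawTagged) -> Option<Value> {
    match v.tag {
        // SAFETY: each read follows the assumed tag convention above.
        0 => Some(Value::Int(unsafe { v.value.int_val })),
        1 => Some(Value::Float(unsafe { v.value.float_val })),
        _ => None,
    }
}
```

In the first form the link between `tag` and the valid field exists only by convention, so the compiler cannot check it. In the enum form the compiler enforces it.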
Taxonomy
Topics: Translation Studies and Practices
