A test-free semantic mistakes localization framework in Neural Code Translation
Lei Chen, Sai Zhang, Fangzhou Xu, Zhenchang Xing, Liang Wan, Xiaowang Zhang, and Zhiyong Feng

TL;DR
EISP is a static analysis framework that uses large language models to locate semantic errors in translated code without relying on test cases, improving accuracy over existing methods.
Contribution
The paper introduces EISP, the first static analysis tool that detects semantic errors in code translation without test cases, leveraging LLMs and semantic mapping.
Findings
Achieved 82.3% accuracy in locating semantic errors.
Outperformed baseline methods by 20.3%.
Surpassed dynamic analysis methods by 7.4%.
Abstract
In the task of code translation, neural network-based models have been shown to frequently produce semantically erroneous code that deviates from the original logic of the source code. This issue persists even with advanced large models. Although a recent approach proposed using test cases to identify these semantic errors, it relies heavily on the quality of the test cases and is not applicable to real-world code snippets that have no test cases. We therefore present EISP, a static analysis framework based on the Large Language Model (LLM). First, the framework generates a semantic mapping between source code and translated code. Next, each sub-code fragment is identified by recursively traversing the abstract syntax tree of the source code, and its corresponding translated code fragment is found through the semantic mapping. Finally, EISP connects each pair of sub-code…
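The traversal and lookup steps described in the abstract can be illustrated with a minimal sketch. It is not EISP's implementation: it assumes the source language is Python so that the standard `ast` module can be used, and it replaces the LLM-generated semantic mapping with a hand-written dictionary keyed by source line number. The names `Fragment`, `sub_fragments`, `pair_fragments`, `SOURCE`, and `MAPPING` are all hypothetical.

```python
import ast
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Fragment(NamedTuple):
    depth: int   # nesting depth of the statement (0 = top level)
    kind: str    # AST node type, e.g. "For"
    lineno: int  # first line of the statement in the source
    text: str    # exact source text of the statement


def sub_fragments(source: str) -> Iterator[Fragment]:
    """Recursively walk the AST and yield every statement, parents first."""
    tree = ast.parse(source)

    def visit(node: ast.AST, depth: int) -> Iterator[Fragment]:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.stmt):
                text = ast.get_source_segment(source, child) or ""
                yield Fragment(depth, type(child).__name__, child.lineno, text)
                # Nested statements are one level deeper.
                yield from visit(child, depth + 1)
            else:
                # Non-statement nodes do not add a nesting level.
                yield from visit(child, depth)

    return visit(tree, 0)


def pair_fragments(
    source: str, mapping: Dict[int, str]
) -> List[Tuple[Fragment, str]]:
    """Pair each source fragment with its translated fragment, if mapped."""
    return [
        (frag, mapping[frag.lineno])
        for frag in sub_fragments(source)
        if frag.lineno in mapping
    ]


SOURCE = """def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
"""

# Hand-written stand-in for the semantic mapping (assumption: in EISP this
# mapping is produced by an LLM, not keyed by line number like this).
MAPPING = {
    1: "int total(int[] xs)",
    2: "int s = 0;",
    3: "for (int x : xs)",
    4: "s += x;",
    5: "return s;",
}

PAIRS = pair_fragments(SOURCE, MAPPING)
```

Each `(Fragment, translated_text)` pair in `PAIRS` is the kind of unit that a later step could compare for semantic equivalence. Visiting parents before children means a mismatch can be narrowed from a whole function down to a single statement.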
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
