Large Language Models for Multilingual Code Intelligence: A Survey
Chao Jiang, Dugang Liu, Cheng Wen, Zhiwu Xu, Hua Zheng, Muhammad Sadiq, Jawwad Ahmed Shamsi, Shengchao Qin, and Zhong Ming

TL;DR
This survey reviews the use of large language models in multilingual code intelligence, emphasizing challenges in less-resourced languages and the importance of semantic preservation across languages.
Contribution
It provides a comprehensive overview of methods, benchmarks, and challenges in multilingual code generation and translation using large language models.
Findings
Current models perform well on high-resource languages like Python but are weaker on languages such as Rust and OCaml.
Multilingual code translation aims to preserve semantics across languages.
Challenges include bias toward high-resource languages and ensuring trustworthy generalization.
Abstract
Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, with weaker performance in languages like Rust and OCaml. Since real-world systems are inherently polyglot, robust multilingual code intelligence is crucial. This survey focuses on two key tasks: multilingual code generation from shared natural-language requirements, and multilingual code translation that preserves semantics across languages. It reviews representative methods, benchmarks, and evaluation metrics, and highlights challenges and opportunities for trustworthy cross-language generalization.
