Exploring Multi-Lingual Bias of Large Code Models in Code Generation
Chaozheng Wang, Zongjie Li, Cuiyun Gao, Wenxuan Wang, Ting Peng, Hailiang Huang, Yuetang Deng, Shuai Wang, Michael R. Lyu

TL;DR
This paper investigates multilingual bias in large code models, revealing significant performance disparities across both natural languages and programming languages, and introduces a benchmark for systematic evaluation.
Contribution
The study constructs the first multilingual evaluation benchmark for code generation and provides a large-scale analysis of bias in nine popular large code models.
Findings
Chinese instructions reduce performance by at least 13%.
Performance gap between Python and C++ reaches 20.9%.
Multilingual bias is prominent in current large code models.
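The two headline figures above are comparisons between scores, and a small sketch may help show how such gaps could be computed. This page does not name the metric or say whether the percentages are relative drops or absolute differences, so the example computes both. The function names and all scores below are hypothetical placeholders, not results from the paper.

```python
def absolute_gap(score_a: float, score_b: float) -> float:
    """Difference between two scores, in percentage points."""
    return score_a - score_b


def relative_drop(baseline: float, other: float) -> float:
    """Drop from `baseline` to `other`, as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - other) / baseline


# Hypothetical scores (percent of problems solved), keyed by
# (instruction language, programming language). Illustrative only.
scores = {
    ("english", "python"): 50.0,
    ("chinese", "python"): 40.0,
    ("english", "cpp"): 35.0,
}

# Natural-language bias: same programming language, different instruction language.
nl_relative = relative_drop(scores[("english", "python")], scores[("chinese", "python")])
nl_absolute = absolute_gap(scores[("english", "python")], scores[("chinese", "python")])

# Programming-language bias: same instruction language, different target language.
pl_absolute = absolute_gap(scores[("english", "python")], scores[("english", "cpp")])

print(f"NL relative drop: {nl_relative:.1f}%")
print(f"NL absolute gap:  {nl_absolute:.1f} points")
print(f"PL absolute gap:  {pl_absolute:.1f} points")
```

With these placeholder scores the same English-to-Chinese comparison reads as a 20% relative drop or a 10-point absolute gap, which is why it matters to know which convention a reported figure follows.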
Abstract
Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models (LCMs) have recently been proposed to generate source code. LCMs can generate highly feasible solutions for programming problems described in natural language. Despite their effectiveness, we observe a noticeable multilingual bias in the generation performance of LCMs. Specifically, LCMs demonstrate proficiency in generating solutions when provided with instructions in English, yet may falter when faced with semantically equivalent instructions in other NLs such as Chinese. Moreover, the ability of LCMs to generate code varies across different programming languages (PLs), such as Python and C++. The observed phenomenon indicates the presence of…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
