A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks

Rohit Dandamudi; Gema Rodríguez-Pérez

arXiv:2411.15470 · cs.SE · November 26, 2024

TL;DR

This study evaluates multilingual code language models on translated benchmarks. The results align with training perplexity but also reveal inconsistencies and reproducibility challenges, underscoring the need for further empirical validation.

Contribution

The study provides the first empirical assessment of translated benchmarks for multilingual code generation models, highlighting both their potential and their limitations.

Findings

01

Results on translated benchmarks align with the perplexity measured during training.

02

Inconsistencies were observed across the different translated benchmarks.

03

Reproducibility challenges were identified in the performance evaluation.
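
For readers unfamiliar with the perplexity metric mentioned in the first finding, the following is a minimal sketch of the standard definition: the exponential of the mean negative log-likelihood per token. The function name and the sample log-probabilities are hypothetical and chosen only for illustration; this is not the study's evaluation code, and the values are not results from the paper.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical example: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among 4 options, so perplexity is about 4.
uniform_over_four = [math.log(0.25)] * 8
print(perplexity(uniform_over_four))
```

Lower perplexity on a language's held-out code means the model finds that code more predictable, which is why it is a natural point of comparison for benchmark scores in the same language.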

Abstract

Evaluating the performance of Code Language Models (CLMs) for software engineering tasks, especially in multilingual and low-resource programming language settings, poses significant challenges. These challenges are primarily due to the lack of high-quality benchmarks across programming languages and the imbalanced nature of CLM training corpora. Although recent advances in one of the common downstream tasks, code generation, have shown promise by introducing translated benchmarks using different methodologies, there is currently little empirical evidence assessing these benchmarks. To address this gap, we conducted a preliminary study to evaluate the performance of Poly-Coder, a pioneering open-source, multilingual CLM built for code generation. We utilized two existing state-of-the-art translations of the popular code generation benchmark, HumanEval, facilitated by the…
