Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation
Pietro Liguori, Cristina Improta, Simona De Vivo, Roberto Natella, Bojan Cukic, Domenico Cotroneo

TL;DR
This paper proposes a perturbation-based evaluation framework to assess the robustness of neural machine translation models specifically for code generation, highlighting vulnerabilities and guiding future improvements.
Contribution
It introduces a novel set of perturbations and metrics tailored for evaluating NMT models in code generation tasks, filling a gap in robustness validation methods.
Findings
Certain perturbations significantly degrade model performance
Insights into model vulnerabilities to specific input changes
Preliminary results guide future robustness enhancements
Abstract
Neural Machine Translation (NMT) has matured to the point of being recognized as the premier method for translation between different languages, and it has aroused interest in different research areas, including software engineering. A key step in validating the robustness of NMT models consists of evaluating their performance on adversarial inputs, i.e., inputs obtained from the original ones by adding small amounts of perturbation. However, for the specific task of code generation (i.e., the generation of code starting from a description in natural language), no approach has yet been defined to validate the robustness of NMT models. In this work, we address the problem by identifying a set of perturbations and metrics tailored for the robustness assessment of such models. We present a preliminary experimental evaluation, showing what type of…
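To make the idea of perturbation-based evaluation concrete, the following is a minimal sketch, not the paper's actual perturbation set or metrics, which this summary does not list. It assumes two illustrative perturbations (dropping a word, swapping adjacent characters) and uses exact match as a stand-in metric. The names `drop_word`, `swap_adjacent_chars`, `exact_match_rate`, `toy_model`, and `PAIRS` are all hypothetical, and `toy_model` is a lookup table standing in for a real NMT model.

```python
import random
from typing import Callable, List, Optional, Tuple

Perturbation = Callable[[str, random.Random], str]


def drop_word(text: str, rng: random.Random) -> str:
    """Illustrative perturbation: remove one randomly chosen word."""
    words = text.split()
    if len(words) < 2:
        return text
    idx = rng.randrange(len(words))
    return " ".join(words[:idx] + words[idx + 1:])


def swap_adjacent_chars(text: str, rng: random.Random) -> str:
    """Illustrative perturbation: swap two adjacent characters (a typo)."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def exact_match_rate(
    model: Callable[[str], str],
    pairs: List[Tuple[str, str]],
    perturb: Optional[Perturbation] = None,
    seed: int = 0,
) -> float:
    """Fraction of descriptions whose generated code equals the reference.

    If `perturb` is given, each description is perturbed before translation.
    """
    rng = random.Random(seed)
    hits = 0
    for description, reference in pairs:
        source = perturb(description, rng) if perturb else description
        hits += model(source).strip() == reference.strip()
    return hits / len(pairs)


# Hypothetical data and a lookup-table "model" (assumption: stands in for NMT).
PAIRS = [
    ("copy the value 1 into register eax", "mov eax, 1"),
    ("add 2 to register ebx", "add ebx, 2"),
]
_TABLE = dict(PAIRS)


def toy_model(description: str) -> str:
    # Returns the memorized snippet, or nothing for unseen input.
    return _TABLE.get(description, "")


if __name__ == "__main__":
    clean = exact_match_rate(toy_model, PAIRS)
    perturbed = exact_match_rate(toy_model, PAIRS, perturb=drop_word)
    print(f"clean={clean:.2f} perturbed={perturbed:.2f}")
```

Comparing the score on clean inputs with the score on perturbed inputs gives a simple robustness signal: the larger the drop, the more sensitive the model is to that kind of input change. A lookup table is maximally brittle by construction, so it only illustrates the measurement procedure, not realistic model behavior.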
