CodeScore-R: An Automated Robustness Metric for Assessing the Functional Correctness of Code Synthesis
Guang Yang, Yu Zhou, Xiang Chen, and Xiangyu Zhang

TL;DR
CodeScore-R is an automated, robust evaluation metric for code synthesis that accurately assesses functional correctness without test cases, using contrastive learning and transformations to ensure stability against minor code changes.
Contribution
The paper introduces CodeScore-R, a novel automated metric combining contrastive learning and transformations to evaluate code functionality more robustly and efficiently.
Findings
Outperforms existing metrics in code generation and migration tasks
More closely aligned with the Pass@k metric
Demonstrates stronger robustness against code variations
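To make the robustness finding concrete, here is a minimal sketch of one kind of minor code change a robust metric should be insensitive to: renaming identifiers without altering behavior. The summary does not list the specific transformations CodeScore-R uses, so the choice of identifier renaming, the helper name `rename_identifiers`, and the example snippet are all illustrative assumptions. It relies on `ast.unparse`, which requires Python 3.9 or later.

```python
import ast


class _Renamer(ast.NodeTransformer):
    """Rewrites variable and parameter names according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes, e.g. `total` in `total = a + b`.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Function parameters, e.g. `a` and `b` in `def add(a, b)`.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def rename_identifiers(source, mapping):
    """Hypothetical helper: return `source` with identifiers renamed.

    The caller is responsible for choosing a mapping that does not
    collide with other names in the snippet.
    """
    tree = _Renamer(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


if __name__ == "__main__":
    original = "def add(a, b):\n    total = a + b\n    return total\n"
    print(rename_identifiers(original, {"a": "x", "b": "y", "total": "s"}))
```

The renamed function differs from the original at the token level but computes the same result, so a robust functional-correctness metric would be expected to give both versions a similar score.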
Abstract
Evaluation metrics are crucial in the field of code synthesis. Commonly used code evaluation metrics can be classified into three types: match-based, semantic-based, and execution-based. Among them, the execution-based Pass@k metric accurately assesses the functionality of predicted code by executing test cases. However, calculating this metric requires a significant amount of overhead, necessitating the design of an automated evaluation metric that can assess the functionality of predicted code without the need for test cases. Additionally, a good evaluation metric should be robust; that is, the metric can maintain its accuracy even when the predicted code undergoes minor changes. To address these challenges, we propose an automated robust metric, called CodeScore-R, based on UniXcoder and contrastive learning, for evaluating the functionality of code synthesis. CodeScore-R employs techniques…
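For context on the execution-based baseline, the sketch below computes Pass@k with the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass all test cases. The abstract does not state which formulation the paper uses, so this estimator and the function name `pass_at_k` are assumptions for illustration.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k samples is correct.

    n: total samples generated for a problem
    c: number of those samples that pass every test case
    k: number of samples drawn (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    # One minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    print(pass_at_k(n=10, c=3, k=1))
```

Obtaining c requires executing every generated sample against a test suite, which is the overhead the abstract refers to.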
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Advanced Software Engineering Methodologies
