Condor: A Code Discriminator Integrating General Semantics with Code Details
Qingyuan Liang, Zhao Zhang, Chen Liu, Zeyu Sun, Wenjie Zhang, Yizhou Chen, Zixiao Zhao, Qi Luo, Wentao Wang, Yanjie Jiang, Yingfei Xiong, Lu Zhang

TL;DR
Condor is a code discriminator that combines general semantics with code details. It uses contrastive learning and intermediate data to make LLM-generated code more reliable, particularly when candidate outputs differ only subtly.
Contribution
This paper introduces Condor, a discriminator that integrates general semantics with code details through contrastive learning and intermediate data, improving its ability to detect subtle differences between code outputs.
Findings
Condor significantly outperforms other discriminators on the CodeNanoFix dataset.
Condor improves Pass@1 scores of LLMs on multiple datasets, including a 147.05% increase on APPS.
Condor demonstrates strong generalization across various code datasets.
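The 147.05% figure in the findings above has to be a relative gain, since Pass@1 itself cannot exceed 100%. As a reading aid, with $p_{\text{base}}$ and $p_{\text{Condor}}$ as illustrative symbols (not the paper's notation) for Pass@1 without and with the discriminator:

```latex
\frac{p_{\text{Condor}} - p_{\text{base}}}{p_{\text{base}}} = 1.4705
\quad\Longrightarrow\quad
p_{\text{Condor}} = 2.4705 \, p_{\text{base}}
```

In other words, Pass@1 with Condor is roughly 2.47 times the baseline value. The baseline itself is not given in this summary.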
Abstract
LLMs demonstrate significant potential across various software engineering tasks. However, they still face challenges in generating correct code on the first attempt when addressing complex requirements. Introducing a discriminator to select reliable outputs from multiple generated results is an effective way to enhance their reliability and stability. Currently, these discriminators fall into two categories: execution-based discriminators and non-execution-based discriminators. Execution-based discriminators face flexibility challenges due to difficulties in obtaining test cases and security concerns, while non-execution-based discriminators, although more flexible, struggle to capture subtle differences in code details. To maintain flexibility while improving the model's ability to capture fine-grained code details, this paper proposes Condor. We first design contrastive learning to…
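The selection step described in the abstract, choosing one output from several generated candidates without executing them, can be sketched as follows. This is a minimal illustration, not Condor's implementation: `select_best` and `toy_score` are hypothetical names, and the scorer is a trivial string heuristic standing in for a learned discriminator.

```python
from typing import Callable, Sequence


def select_best(
    problem: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate that the discriminator scores highest.

    No candidate is executed; `score` only reads the problem text and
    the code text, which is what makes the approach non-execution-based.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # max() returns the first maximal element, so ties go to the earlier candidate.
    return max(candidates, key=lambda code: score(problem, code))


def toy_score(problem: str, code: str) -> float:
    # Hypothetical stand-in for a trained discriminator (assumption, not
    # from the paper): rewards code that contains an addition.
    return 1.0 if " + " in code else 0.0


problem = "Write add(a, b) that returns the sum of a and b."
candidates = [
    "def add(a, b):\n    return a - b",  # subtle one-character bug
    "def add(a, b):\n    return a + b",
]
best = select_best(problem, candidates, toy_score)
print(best)
```

The two candidates differ by a single character, which is the kind of fine-grained difference the abstract says non-execution-based discriminators tend to miss. In practice `toy_score` would be replaced by a model that outputs a correctness estimate.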
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research
