What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs?

Shuzheng Gao; Xin-Cheng Wen; Cuiyun Gao; Wenxuan Wang; Hongyu Zhang; Michael R. Lyu

arXiv:2304.07575·cs.SE·January 11, 2024·5 cites

Open Access 1 Repo

TL;DR

This paper investigates how the selection, order, and number of in-context demonstrations affect the performance of large language models in code intelligence tasks, providing guidelines for constructing effective demonstrations.

Contribution

It systematically studies the impact of demonstration construction factors on ICL performance in code tasks and offers practical recommendations for improvement.

Findings

01

All three factors (the selection, order, and number of demonstrations) significantly influence ICL performance.

02

Carefully designed demonstrations can substantially outperform standard demonstration construction methods.

03

Significant improvements in BLEU-4 and exact match (EM) scores across tasks.

Abstract

Pre-trained models of source code have gained widespread popularity in many code intelligence tasks. Recently, with the scaling of the model and corpus size, large language models have shown the ability of in-context learning (ICL). ICL employs task instructions and a few examples as demonstrations, and then inputs the demonstrations to the language models for making predictions. This new learning paradigm is training-free and has shown impressive performance in various natural language processing and code intelligence tasks. However, the performance of ICL heavily relies on the quality of demonstrations, e.g., the selected examples. It is important to systematically investigate how to construct a good demonstration for code-related tasks. In this paper, we empirically explore the impact of three key factors on the performance of ICL in code intelligence tasks: the selection, order, and…
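The three factors named in the abstract (selection, order, and number of demonstrations) can be made concrete with a small prompt builder. The following is a minimal sketch, not the authors' implementation: the function names, the token-overlap similarity, the "most similar example last" ordering, and the prompt template are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical demonstration type: (code snippet, natural-language summary).
Example = Tuple[str, str]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (an assumed, simple metric)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_prompt(instruction: str, pool: List[Example], query: str, k: int = 2) -> str:
    """Assemble an ICL prompt: instruction, k demonstrations, then the query."""
    # Selection and number: keep the k pool examples most similar to the query.
    ranked = sorted(pool, key=lambda ex: jaccard(ex[0], query), reverse=True)[:k]
    # Order: put the most similar demonstration last, closest to the query
    # (one possible ordering, assumed here for illustration).
    ranked.reverse()
    parts = [instruction]
    for code, summary in ranked:
        parts.append(f"Code: {code}\nSummary: {summary}")
    # The query uses the same template with the answer left open for the model.
    parts.append(f"Code: {query}\nSummary:")
    return "\n\n".join(parts)


# Toy demonstration pool, invented for this example.
pool = [
    ("def add(a, b): return a + b", "Add two numbers."),
    ("def read(path): return open(path).read()", "Read a file."),
    ("def sub(a, b): return a - b", "Subtract two numbers."),
]
prompt = build_prompt("Summarize the code.", pool, "def mul(a, b): return a * b", k=2)
print(prompt)
```

Changing `k`, the similarity function, or the `reverse()` step varies the number, selection, and order of demonstrations independently, which is the kind of controlled comparison the abstract describes.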

Code & Models

Repositories

shuzhenggao/icl4code
PyTorch · Official

Taxonomy

Topics: Software Engineering Research · Software Reliability and Analysis Research · Software System Performance and Reliability