Does Few-Shot Learning Help LLM Performance in Code Synthesis?

Derek Xu; Tong Xie; Botao Xia; Haoyu Li; Yunsheng Bai; Yizhou Sun; Wei Wang

arXiv:2412.02906·cs.SE·December 5, 2024

TL;DR

This paper systematically investigates the impact of few-shot examples in prompts on large language models' code synthesis performance, proposing methods for selecting impactful examples to enhance code generation accuracy.

Contribution

The paper introduces two approaches for selecting few-shot examples, one model-free (CODEEXEMPLAR-FREE) and one model-based (CODEEXEMPLAR-BASED), both of which improve LLM code synthesis performance.

Findings

01 Both methods significantly improve CodeLlama's performance on HumanEval+.

02 The two approaches trade off performance gains against reliance on training data and interpretability.

03 Systematic analysis reveals which few-shot examples are most impactful.

Abstract

Large language models (LLMs) have made significant strides in code generation through improved model design, training, and chain-of-thought. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for coding. This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study on whether few-shot examples improve LLMs' coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples. Our work offers two approaches for selecting few-shot examples: a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED. The two methods offer a trade-off between improved performance and reliance on training data and interpretability. Both methods significantly improve CodeLlama's coding ability across the popular HumanEval+ coding benchmark. In summary, our work…
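To make the setting concrete, here is a minimal sketch of how few-shot examples might be chosen and placed ahead of a coding task in a prompt. It is not the paper's CODEEXEMPLAR-FREE or CODEEXEMPLAR-BASED method, whose details are not given on this page. The `Example` class, the token-overlap score, and the prompt layout are all hypothetical and serve only to illustrate selecting examples without querying a model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Example:
    """One few-shot demonstration: a task description and a reference solution."""
    prompt: str
    solution: str


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens.

    Assumption: a stand-in scoring rule chosen for illustration only,
    not the selection criterion used in the paper.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_examples(task: str, pool: list[Example], k: int) -> list[Example]:
    """Return the k pool examples whose prompts score highest against the task."""
    ranked = sorted(pool, key=lambda ex: token_overlap(task, ex.prompt), reverse=True)
    return ranked[:k]


def build_prompt(task: str, shots: list[Example]) -> str:
    """Concatenate the selected demonstrations, then the unsolved task."""
    parts = [f"{ex.prompt}\n{ex.solution}\n" for ex in shots]
    parts.append(task)
    return "\n".join(parts)


if __name__ == "__main__":
    pool = [
        Example("Return the product of a list of numbers",
                "def product(xs):\n    out = 1\n    for x in xs:\n        out *= x\n    return out"),
        Example("Reverse a string", "def reverse(s):\n    return s[::-1]"),
    ]
    task = "Return the sum of a list of numbers"
    print(build_prompt(task, select_examples(task, pool, k=1)))
```

Because the score here depends only on the text of the task and the candidate examples, the ranking is easy to inspect by hand. A learned selector would replace `token_overlap` with a trained scoring function, at the cost of needing training data.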

Taxonomy

Topics: Natural Language Processing Techniques