Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein, Nguessan Hermann Kouadio, Mathieu Acher, Djamel Eddine Khelladi, Benoit Combemale

TL;DR
This study investigates how input variations like prompts and parameters influence the performance of LLM-based code assistants, revealing significant performance improvements and complex interactions that affect their practical deployment.
Contribution
It systematically analyzes the impact of input modifications on code assistant effectiveness across multiple models and benchmarks, highlighting their potential and limitations.
Findings
Input variations can raise success rates to as high as 79.27%.
Optimal settings vary by problem and model.
Removing prompts can sometimes improve performance.
Abstract
Language models are promising solutions for tackling increasingly complex problems. In software engineering, they recently gained attention in code assistants, which generate programs from a natural language task description (prompt). They have the potential to save time and effort but remain poorly understood, limiting their optimal use. In this article, we investigate the impact of input variations on two configurations of a language model, focusing on parameters such as task description, surrounding context, model creativity, and the number of generated solutions. We design specific operators to modify these inputs and apply them to three LLM-based code assistants (Copilot, Codex, StarCoder2) and two benchmarks representing algorithmic problems (HumanEval, LeetCode). Our study examines whether these variations significantly affect program quality and how these effects generalize across…
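To make the idea of input variations concrete, here is a minimal sketch of how a grid over prompt operators, temperature, and number of generated solutions might be enumerated and scored. It is not the authors' implementation: the operator names (`keep_prompt`, `drop_example_calls`), the `Config` type, and the parameter values are all hypothetical, and no model is actually called.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List


def keep_prompt(prompt: str) -> str:
    """Hypothetical operator: leave the task description unchanged."""
    return prompt


def drop_example_calls(prompt: str) -> str:
    """Hypothetical operator: remove doctest-style call lines ('>>> ...')."""
    kept = [ln for ln in prompt.splitlines() if not ln.strip().startswith(">>>")]
    return "\n".join(kept)


# Assumed operator registry; the paper's actual operators are not listed here.
OPERATORS: Dict[str, Callable[[str], str]] = {
    "keep": keep_prompt,
    "drop_example_calls": drop_example_calls,
}


@dataclass(frozen=True)
class Config:
    operator: str       # key into OPERATORS
    temperature: float  # sampling temperature ("model creativity")
    n_samples: int      # number of generated solutions per problem


def build_grid(operators: List[str], temperatures: List[float],
               sample_counts: List[int]) -> List[Config]:
    """Enumerate every combination of operator, temperature, and sample count."""
    return [Config(o, t, n) for o, t, n in product(operators, temperatures, sample_counts)]


def success_rate(solved: List[bool]) -> float:
    """Fraction of problems counted as solved under one configuration."""
    return sum(solved) / len(solved) if solved else 0.0


# Illustrative values only.
grid = build_grid(list(OPERATORS), [0.0, 0.5, 1.0], [1, 10])
variant = OPERATORS["drop_example_calls"]("Add two numbers.\n>>> add(1, 2)\n3")
```

Each `Config` would then be paired with a benchmark problem, sent to the assistant under test, and the per-problem pass/fail results summarized with `success_rate`.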
Taxonomy
Topics: Software Engineering Research · Reinforcement Learning in Robotics · Software Engineering Techniques and Practices
