Breaking the Silence: the Threats of Using LLMs in Software Engineering

June Sallou; Thomas Durieux; Annibale Panichella

arXiv:2312.08055 · cs.SE · January 9, 2024 · 5 cites

TL;DR

This paper discusses threats to the validity of research that uses Large Language Models (LLMs) in Software Engineering (SE), and emphasizes the need for guidelines to ensure reliable research outcomes.

Contribution

It identifies key threats to the validity of LLM-based research in SE and proposes tailored guidelines for researchers and LLM providers to mitigate them.

Findings

01. Identification of threats such as data leakage and reproducibility issues

02. Illustration of the guidelines through existing practices and a practical example

Abstract

Large Language Models (LLMs) have gained considerable traction within the Software Engineering (SE) community, impacting various SE tasks from code completion to test generation, from program repair to code summarization. Despite their promise, researchers must still be careful as numerous intricate factors can influence the outcomes of experiments involving LLMs. This paper initiates an open discussion on potential threats to the validity of LLM-based research including issues such as closed-source models, possible data leakage between LLM training data and research evaluation, and the reproducibility of LLM-based findings. In response, this paper proposes a set of guidelines tailored for SE researchers and Language Model (LM) providers to mitigate these concerns. The implications of the guidelines are illustrated using existing good practices followed by LLM providers and a practical…

Code & Models

Repositories

llm4se/obfuscated-chatgpt-experiments (official)

Taxonomy

Topics: Software Engineering Research · Software Engineering Techniques and Practices

Methods: Sparse Evolutionary Training