LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation

Sarah Fakhoury; Aaditya Naik; Georgios Sakkas; Saikat Chakraborty; Shuvendu K. Lahiri

arXiv:2404.10100 · cs.SE · October 4, 2024 · 2 cites

Open Access

TL;DR

This paper introduces TiCoder, an interactive workflow that uses test-driven clarification to improve the accuracy of code generated by large language models, demonstrated through user studies and empirical tests.

Contribution

The paper presents a novel test-driven interactive workflow, TiCoder, that enhances LLM-based code generation accuracy and user understanding through intent clarification and automated testing.

Findings

01

Participants using TiCoder more accurately evaluated generated code.

02

TiCoder significantly reduced task-induced cognitive load.

03

TiCoder achieved an average 45.97% improvement in code accuracy across models and datasets.

Abstract

Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent. However, because NL is informal, it does not lend itself easily to checking that the generated code correctly satisfies the user's intent. In this paper, we propose a novel interactive workflow, TiCoder, for guided intent clarification (i.e., partial formalization) through tests to support the generation of more accurate code suggestions. Through a mixed-methods user study with 15 programmers, we present an empirical evaluation of the effectiveness of the workflow in improving code generation accuracy. We find that participants using the proposed workflow are significantly more likely to correctly evaluate AI-generated code, and report significantly less task-induced cognitive load. Furthermore, we test the potential of the workflow…
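
The idea of clarifying intent through tests can be illustrated with a small sketch. This is not the paper's implementation: it assumes candidate programs and candidate tests are already available (in the workflow they would come from an LLM), and the names `clarify_with_tests`, `user_approves`, and `max_queries` are hypothetical. Each test is shown to the user, who says whether it matches their intent; candidates that disagree with the answer are pruned.

```python
from typing import Callable, List, Tuple

Candidate = Callable[[int], int]
Test = Tuple[int, int]  # (input, expected output)


def passes(candidate: Candidate, test: Test) -> bool:
    """Return True if the candidate produces the expected output for the test."""
    arg, expected = test
    try:
        return candidate(arg) == expected
    except Exception:
        return False


def clarify_with_tests(
    candidates: List[Candidate],
    tests: List[Test],
    user_approves: Callable[[Test], bool],
    max_queries: int = 5,  # hypothetical cap on user interactions
) -> List[Candidate]:
    """Prune candidates using the user's yes/no answers about proposed tests."""
    remaining = list(candidates)
    for test in tests[:max_queries]:
        if user_approves(test):
            # The test reflects the intent: keep only candidates that pass it.
            remaining = [c for c in remaining if passes(c, test)]
        else:
            # The test contradicts the intent: drop candidates that pass it.
            remaining = [c for c in remaining if not passes(c, test)]
    return remaining


# Hypothetical candidates for the informal intent "return the absolute value".
def identity(x: int) -> int:
    return x


def negate(x: int) -> int:
    return -x


def absolute(x: int) -> int:
    return x if x >= 0 else -x


# Hypothetical tests; the simulated user approves only those matching abs().
proposed_tests = [(-3, 3), (2, 2), (-3, -3)]
survivors = clarify_with_tests(
    [identity, negate, absolute],
    proposed_tests,
    user_approves=lambda t: abs(t[0]) == t[1],
)
print([c.__name__ for c in survivors])
```

In this toy run, each answer removes the candidates that contradict it, so only the candidate consistent with every answer remains. A rejected test is as informative as an approved one, since it rules out any candidate that passes it.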

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Software Testing and Debugging Techniques · Software Engineering Research