Assessing the Promise and Pitfalls of ChatGPT for Automated Code Generation

Muhammad Fawad Akbar Khan; Max Ramsdell; Erik Falor; Hamid Karimi

arXiv:2311.02640 · cs.SE · November 7, 2023 · 2 cites

TL;DR

This study evaluates ChatGPT's code generation against human programmers across a range of code quality metrics, finding strengths in efficiency and data analysis tasks but limitations in visual-graphical challenges.

Contribution

It introduces a novel dataset of 131 code-generation prompts across 5 categories and a manual evaluation methodology built on 14 code quality metrics, giving quantitative and qualitative insight into ChatGPT's capabilities and limitations.

Findings

01. ChatGPT achieves 93.1% accuracy in data analysis tasks

02. It excels in concise, efficient, modular code with good error handling

03. Machine learning models can distinguish ChatGPT code from human code with 88% accuracy

Abstract

This paper presents a comprehensive evaluation of the code generation capabilities of ChatGPT, a prominent large language model, compared to human programmers. A novel dataset of 131 code-generation prompts across 5 categories was curated to enable robust analysis. Code solutions were generated by both ChatGPT and humans for all prompts, resulting in 262 code samples. A meticulous manual assessment methodology prioritized evaluating correctness, comprehensibility, and security using 14 established code quality metrics. The key findings reveal ChatGPT's strengths in crafting concise, efficient code with advanced constructs, showcasing strengths in data analysis tasks (93.1% accuracy) but limitations in visual-graphical challenges. Comparative analysis with human code highlights ChatGPT's inclination towards modular design and superior error handling. Additionally, machine learning models…

Code & Models

Repositories

dsaatusu/chatgpt-promises-and-pitfalls
none · Official

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Software Engineering Research