Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures

Sayed Erfan Arefin; Tasnia Ashrafi Heya; Hasan Al-Qudah; Ynes Ineza; Abdul Serwadda

arXiv:2307.05360·cs.SE·May 28, 2024·1 cites

Open Access

TL;DR

This paper thoroughly evaluates ChatGPT's coding skills in data structures and algorithms, analyzing correctness, code quality, error patterns, and potential memorization, comparing GPT-3.5 and GPT-4 with human performance.

Contribution

It provides the largest systematic assessment of ChatGPT's algorithmic coding abilities, revealing strengths, limitations, and insights into its learning and memorization behaviors.

Findings

1. ChatGPT shows high correctness on simple problems but struggles with complex ones.

2. GPT-4 outperforms GPT-3.5 in accuracy and code quality.

3. Evidence suggests some level of memorization in ChatGPT's responses.

Abstract

The transformative influence of Large Language Models (LLMs) is profoundly reshaping the Artificial Intelligence (AI) technology domain. Notably, ChatGPT distinguishes itself within these models, demonstrating remarkable performance in multi-turn conversations and exhibiting code proficiency across an array of languages. In this paper, we carry out a comprehensive evaluation of ChatGPT's coding capabilities based on what is to date the largest catalog of coding challenges. Our focus is on the Python programming language and problems centered on data structures and algorithms, two topics at the very foundations of Computer Science. We evaluate ChatGPT for its ability to generate correct solutions to the problems fed to it, its code quality, and the nature of the run-time errors thrown by its code. Where ChatGPT code successfully executes, but fails to solve the problem at hand, we look into…
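To make the evaluation dimensions in the abstract concrete, the sketch below shows one way a generated solution could be sorted into the three outcomes the abstract distinguishes: correct, executes but gives a wrong answer, or throws a run-time error. It is an illustrative assumption, not the authors' harness; `classify_run`, the `two_sum_*` candidates, and the `TESTS` cases are all hypothetical names and data invented for this example.

```python
from collections import Counter


def classify_run(solution, test_cases):
    """Classify a candidate as 'correct', 'wrong_answer', or 'runtime_error:<Name>'.

    Hypothetical helper: the first failing test case decides the outcome.
    """
    for args, expected in test_cases:
        try:
            result = solution(*args)
        except Exception as exc:  # record which kind of run-time error occurred
            return f"runtime_error:{type(exc).__name__}"
        if result != expected:
            return "wrong_answer"
    return "correct"


# Three hypothetical candidates standing in for model-generated code.
def two_sum_ok(nums, target):
    seen = {}
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return []


def two_sum_wrong(nums, target):
    return [0, 1]  # runs without error but ignores the input


def two_sum_crash(nums, target):
    return [nums[len(nums)], 0]  # out-of-range index raises IndexError


# Made-up test cases: (arguments, expected result).
TESTS = [(([2, 7, 11, 15], 9), [0, 1]), (([3, 2, 4], 6), [1, 2])]

outcomes = Counter(
    classify_run(f, TESTS) for f in (two_sum_ok, two_sum_wrong, two_sum_crash)
)
print(outcomes)
```

Tallying outcomes per problem in this way would allow correctness rates and run-time error types to be summarized separately, which mirrors the split the abstract describes.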

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Explainable Artificial Intelligence (XAI)