Kattis vs. ChatGPT: Assessment and Evaluation of Programming Tasks in the Age of Artificial Intelligence
Nora Dunder, Saga Lundborg, Olga Viberg, Jacqueline Wong

TL;DR
This study evaluates ChatGPT's ability to solve programming problems of varying difficulty, revealing its strengths in simple tasks and limitations with complex ones, thus informing AI's role in computer science education.
Contribution
It provides an empirical assessment of ChatGPT's performance on real-world programming problems from Kattis, highlighting its capabilities and limitations in educational contexts.
Findings
ChatGPT solved 19 out of 127 problems independently.
It performs well on simple programming tasks.
It struggles with more complex problems.
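For scale, the headline result can be expressed as a solve rate. The snippet below is a minimal sketch of that arithmetic only; the function name `solve_rate` is a hypothetical helper introduced for illustration and is not taken from the study.

```python
def solve_rate(solved: int, total: int) -> float:
    """Return the fraction of problems solved (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return solved / total


# Figures reported in the study: 19 of 127 Kattis problems solved independently.
rate = solve_rate(19, 127)
print(f"{rate:.1%}")  # prints 15.0%
```

In other words, roughly one problem in seven was solved without assistance.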
Abstract
AI-powered education technologies can support students and teachers in computer science education. However, with the recent developments in generative AI, and especially the rapidly growing popularity of ChatGPT, the effectiveness of using large language models for solving programming tasks has been underexplored. The present study examines ChatGPT's ability to generate code solutions at different difficulty levels for introductory programming courses. We conducted an experiment where ChatGPT was tested on 127 randomly selected programming problems provided by Kattis, an automatic software grading tool for computer science programs, often used in higher education. The results showed that ChatGPT could independently solve 19 out of 127 programming tasks generated and assessed by Kattis. Further, ChatGPT was found to be able to generate accurate code solutions for simple problems…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Software Engineering Research · Software Engineering Techniques and Practices
