CSEPrompts: A Benchmark of Introductory Computer Science Prompts
Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Christian Newman, Tharindu Ranasinghe, Marcos Zampieri

TL;DR
This paper introduces CSEPrompts, a comprehensive benchmark of programming exercises and questions from introductory CS courses, to evaluate LLMs' capabilities in generating code and answering related questions.
Contribution
It provides a new benchmark dataset for assessing LLM performance on introductory CS prompts and offers experimental analysis of various models' effectiveness.
Findings
LLMs show varying proficiency in generating Python code.
Performance differs across the LLMs evaluated.
CSEPrompts serves as a valuable tool for future research in AI-assisted CS education.
Abstract
Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs) that are trained on massive amounts of data and often have trillions of parameters. Commercial applications (e.g., ChatGPT) have made this technology available to the general public, thus making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students and they have been researching the impact of this new technology and its potential misuse. Educational programs in Computer Science (CS) and related fields are particularly affected because LLMs are also capable of generating programming code in various programming languages. To help understand the potential impact of publicly available LLMs in CS education, we introduce CSEPrompts, a…
Taxonomy
Topics: Teaching and Learning Programming · Online Learning and Analytics · Experimental Learning in Engineering
Methods: Attentive Walk-Aggregating Graph Neural Network
