CSEPrompts: A Benchmark of Introductory Computer Science Prompts

Nishat Raihan; Dhiman Goswami; Sadiya Sayara Chowdhury Puspo; Christian Newman; Tharindu Ranasinghe; Marcos Zampieri

arXiv:2404.02540·cs.CL·April 5, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces CSEPrompts, a comprehensive benchmark of programming exercises and questions from introductory CS courses, to evaluate LLMs' capabilities in generating code and answering related questions.

Contribution

It provides a new benchmark dataset for assessing LLM performance on introductory CS prompts and offers experimental analysis of various models' effectiveness.

Findings

01. LLMs show varying proficiency in generating Python code.

02. Performance differs across the LLMs evaluated.

03. CSEPrompts serves as a valuable tool for future research in AI-assisted CS education.

Abstract

Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs) that are trained on massive amounts of data and often have trillions of parameters. Commercial applications (e.g., ChatGPT) have made this technology available to the general public, thus making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students and they have been researching the impact of this new technology and its potential misuse. Educational programs in Computer Science (CS) and related fields are particularly affected because LLMs are also capable of generating programming code in various programming languages. To help understand the potential impact of publicly available LLMs in CS education, we introduce CSEPrompts, a…

Code & Models

Repositories

mraihan-gmu/cseprompts
none · Official

Taxonomy

Topics: Teaching and Learning Programming · Online Learning and Analytics · Experimental Learning in Engineering

Methods: Attentive Walk-Aggregating Graph Neural Network