How Far Are We? The Triumphs and Trials of Generative AI in Learning Software Engineering
Rudrajit Choudhuri, Dylan Liu, Igor Steinmacher, Marco Gerosa, Anita Sarma

TL;DR
This between-subjects study (N=22) evaluates how well ChatGPT supports students in software engineering tasks. It finds no statistically significant gains in productivity or self-efficacy over traditional resources, significantly higher frustration, and five interaction faults that degrade the user experience.
Contribution
It provides empirical insights into the current capabilities and challenges of conversational AI in software engineering education, highlighting interaction issues and user frustration.
Findings
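No statistically significant difference in productivity or self-efficacy between ChatGPT and traditional resources
Significantly increased frustration levels among students using ChatGPT
Five distinct interaction faults, arising from violations of Human-AI interaction guidelines, that led to seven negative consequences for participants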
Abstract
Conversational Generative AI (convo-genAI) is revolutionizing Software Engineering (SE) as engineers and academics embrace this technology in their work. However, there is a gap in understanding the current potential and pitfalls of this technology, specifically in supporting students in SE tasks. In this work, we evaluate through a between-subjects study (N=22) the effectiveness of ChatGPT, a convo-genAI platform, in assisting students in SE tasks. Our study did not find statistical differences in participants' productivity or self-efficacy when using ChatGPT as compared to traditional resources, but we found significantly increased frustration levels. Our study also revealed 5 distinct faults arising from violations of Human-AI interaction guidelines, which led to 7 different (negative) consequences on participants.
Taxonomy
Topics: AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Software Engineering Research
