Evaluating the Effectiveness of Large Language Models in Solving Simple Programming Tasks: A User-Centered Study
Kai Deng

TL;DR
This study examines how different interaction styles of ChatGPT-4o influence high school students' performance and satisfaction in simple programming tasks, emphasizing the importance of user-centered AI design.
Contribution
It introduces a user-centered evaluation of ChatGPT-4o's interaction styles, demonstrating the impact of collaborative engagement on learning outcomes.
Findings
Collaborative style improves task completion time
Participants report higher satisfaction with collaborative interaction
Interaction style significantly affects user performance and perception
Abstract
As large language models (LLMs) become more common in educational tools and programming environments, questions arise about how these systems should interact with users. This study investigates how different interaction styles with ChatGPT-4o (passive, proactive, and collaborative) affect user performance on simple programming tasks. I conducted a within-subjects experiment in which fifteen high school students completed three problems under three distinct versions of the model. Each version was designed to represent a specific style of AI support: responding only when asked, offering suggestions automatically, or engaging the user in back-and-forth dialogue. Quantitative analysis revealed that the collaborative interaction style significantly improved task completion time compared to the passive and proactive conditions. Participants also reported higher satisfaction and…
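In a within-subjects design like the one described, every participant experiences all three interaction styles, so the order in which the styles are presented can itself influence completion time. The abstract does not say how presentation order was handled; the sketch below is only an illustration of one common approach, a cyclic Latin square, under the assumption that orders are rotated across participants. The function names and the participant numbering are hypothetical and not taken from the study.

```python
# Illustrative sketch only: the study's actual ordering procedure is not
# stated in the abstract. Names below are hypothetical.

CONDITIONS = ("passive", "proactive", "collaborative")


def latin_square_orders(conditions):
    """Build a cyclic Latin square: row r is the condition list rotated by r."""
    k = len(conditions)
    return [
        tuple(conditions[(row + col) % k] for col in range(k))
        for row in range(k)
    ]


def assign_orders(n_participants, conditions=CONDITIONS):
    """Give each participant one row of the square, cycling through the rows."""
    square = latin_square_orders(conditions)
    return [square[i % len(square)] for i in range(n_participants)]


if __name__ == "__main__":
    # Fifteen participants, matching the sample size reported in the abstract.
    for pid, order in enumerate(assign_orders(15), start=1):
        print(f"P{pid:02d}: {' -> '.join(order)}")
```

With fifteen participants and three conditions, this rotation places each style in each serial position exactly five times, which balances position effects, though a cyclic square does not balance which style immediately precedes which.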
