Can AI be a Teaching Partner? Evaluating ChatGPT, Gemini, and DeepSeek across Three Teaching Strategies
Talita de Paula Cypriano de Souza, Shruti Mehta, Matheus Arataque Uema, Luciano Bernardes de Paula, Seiji Isotani

TL;DR
This study compares ChatGPT, Gemini, and DeepSeek as teaching agents across three pedagogical strategies, providing empirical evidence on their effectiveness in teaching programming to beginners.
Contribution
It introduces an evaluation protocol for assessing LLMs as pedagogical tools and compares their performance across different teaching strategies.
Findings
ChatGPT and Gemini scored higher than DeepSeek in pedagogical effectiveness.
The models showed similar interaction patterns in two strategies: Examples, and Explanations and Analogies.
With the Socratic Method, performance varied across models and was influenced by the initial prompt.
Abstract
There are growing promises that Large Language Models (LLMs) can support students' learning by providing explanations, feedback, and guidance. However, despite their rapid adoption and widespread attention, there is still limited empirical evidence regarding the pedagogical skills of LLMs. This article presents a comparative study of three popular LLMs, namely ChatGPT, DeepSeek, and Gemini, acting as teaching agents. An evaluation protocol was developed, focusing on three pedagogical strategies: Examples, Explanations and Analogies, and the Socratic Method. Six human judges conducted the evaluations in the context of teaching the C programming language to beginners. The results indicate that the models exhibited similar interaction patterns in the Examples strategy and in the Explanations and Analogies strategy. In contrast, for the Socratic Method, the models showed greater sensitivity to…
