Pedagogy-driven Evaluation of Generative AI-powered Intelligent Tutoring Systems

Kaushal Kumar Maurya; Ekaterina Kochmar

arXiv:2510.22581·cs.CL·October 28, 2025

TL;DR

This paper reviews current evaluation practices for AI-powered Intelligent Tutoring Systems, highlighting challenges and proposing research directions for standardized, pedagogically grounded assessment frameworks.

Contribution

It provides a comprehensive review of evaluation methods, identifies key challenges, and proposes new research directions based on learning science principles for ITS assessment.

Findings

01. Current evaluations rely on subjective protocols and non-standardized benchmarks.

02. Challenges include the lack of universally accepted, pedagogy-driven evaluation frameworks.

03. The paper proposes research directions for fair, scalable, and pedagogically grounded evaluation methods.

Abstract

The interdisciplinary research domain of Artificial Intelligence in Education (AIED) has a long history of developing Intelligent Tutoring Systems (ITSs) by integrating insights from technological advancements, educational theories, and cognitive psychology. The remarkable success of generative AI (GenAI) models has accelerated the development of large language model (LLM)-powered ITSs, which have the potential to imitate human-like, pedagogically rich, and cognitively demanding tutoring. However, the progress and impact of these systems remain largely untraceable due to the absence of reliable, universally accepted, and pedagogy-driven evaluation frameworks and benchmarks. Most existing evaluations of educational dialogue-based ITSs rely on subjective protocols and non-standardized benchmarks, leading to inconsistencies and limited generalizability. In this work, we take a step back from…
