AITutor-EvalKit: Exploring the Capabilities of AI Tutors

Numaan Naeem; Kaushal Kumar Maurya; Kseniia Petukhova; Ekaterina Kochmar

arXiv:2512.03688·cs.CL·February 24, 2026

TL;DR

AITutor-EvalKit is a comprehensive tool that assesses AI tutors' pedagogical quality, offers demonstration and evaluation capabilities, and facilitates model inspection and data visualization for education stakeholders and researchers.

Contribution

This paper introduces AITutor-EvalKit, a novel application combining evaluation, visualization, and feedback collection for AI tutors in educational settings.

Findings

1. Supports pedagogical quality assessment of AI tutors
2. Enables model inspection and data visualization
3. Facilitates user feedback collection

Abstract

We present AITutor-EvalKit, an application that uses language technology to evaluate the pedagogical quality of AI tutors. It provides software for demonstration and evaluation, as well as for model inspection and data visualization. The tool is aimed at education stakeholders as well as the *ACL community at large, as it supports learning and can also be used to collect user feedback and annotations.

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Teaching and Learning Programming · Explainable Artificial Intelligence (XAI)