SafeTutors: Benchmarking Pedagogical Safety in AI Tutoring Systems
Rima Hazra, Bikram Ghuku, Ilona Marchenko, Yaroslava Tokarieva, Sayan Layek, Somnath Banerjee, Julia Stoyanovich, Mykola Pechenizkiy

TL;DR
This paper introduces SafeTutors, a benchmark for evaluating the safety and pedagogical effectiveness of AI tutoring systems across multiple subjects, highlighting common risks and the need for discipline-aware mitigation strategies.
Contribution
The paper presents SafeTutors, a benchmark built on a risk taxonomy of 11 harm dimensions and 48 sub-risks that jointly assesses safety and pedagogy in AI tutors across mathematics, physics, and chemistry.
Findings
All models exhibit broad harm across subjects.
Scaling does not reliably reduce harms.
Multi-turn dialogues significantly increase pedagogical failures.
Abstract
Large language models are rapidly being deployed as AI tutors, yet current evaluation paradigms assess problem-solving accuracy and generic safety in isolation, failing to capture whether a model is simultaneously pedagogically effective and safe across student-tutor interaction. We argue that tutoring safety is fundamentally different from conventional LLM safety: the primary risk is not toxic content but the quiet erosion of learning through answer over-disclosure, misconception reinforcement, and the abdication of scaffolding. To systematically study this failure mode, we introduce SafeTutors, a benchmark that jointly evaluates safety and pedagogy across mathematics, physics, and chemistry. SafeTutors is organized around a theoretically grounded risk taxonomy comprising 11 harm dimensions and 48 sub-risks drawn from learning-science literature. We uncover that all models show broad…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)
