Safe Generative Chats in a WhatsApp Intelligent Tutoring System
Zachary Levonian, Owen Henkel

TL;DR
This paper discusses designing and evaluating safety measures for a WhatsApp-based intelligent tutoring system using large language models, highlighting empirical findings from over 8,000 student interactions.
Contribution
It introduces a safe conversational system for ITS, evaluates its safety through multiple methods, and provides insights into safeguarding student interactions with LLMs.
Findings
GPT-3.5 rarely generates inappropriate messages
Student messages are more often inappropriate, requiring moderation
Implications for design focus on student input moderation
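The focus on student input moderation can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a two-sided gate: the student message is checked before it reaches the model, and the model reply is checked before it reaches the student. The names `moderated_turn`, `keyword_filter`, `echo_tutor`, and `REDIRECT_MESSAGE` are hypothetical. The keyword check stands in for whatever moderation classifier a real deployment would use.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical fallback text; the wording used in the real system is not given in the source.
REDIRECT_MESSAGE = "Let's get back to the lesson. What are you working on?"


@dataclass
class TurnResult:
    reply: str
    student_flagged: bool
    model_flagged: bool


def moderated_turn(
    student_message: str,
    generate: Callable[[str], str],
    is_inappropriate: Callable[[str], bool],
) -> TurnResult:
    """Run one chat turn with moderation on both the input and the output."""
    # Check the student message first, so flagged input never reaches the model.
    if is_inappropriate(student_message):
        return TurnResult(REDIRECT_MESSAGE, student_flagged=True, model_flagged=False)

    reply = generate(student_message)

    # Check the generated reply before it is sent to the student.
    if is_inappropriate(reply):
        return TurnResult(REDIRECT_MESSAGE, student_flagged=False, model_flagged=True)

    return TurnResult(reply, student_flagged=False, model_flagged=False)


def keyword_filter(text: str) -> bool:
    """Toy stand-in for a moderation classifier (assumption, not from the paper)."""
    blocked = {"badword"}
    return any(word.strip(".,!?").lower() in blocked for word in text.split())


def echo_tutor(message: str) -> str:
    """Toy stand-in for the LLM call (assumption, not from the paper)."""
    return f"Good question about: {message}"
```

Passing `generate` and `is_inappropriate` as callables keeps the gate independent of any particular model or moderation service, so either can be swapped without changing the turn logic.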
Abstract
Large language models (LLMs) are flexible, personalizable, and available, which makes their use within Intelligent Tutoring Systems (ITSs) appealing. However, that flexibility creates risks: inaccuracies, harmful content, and non-curricular material. Ethically deploying LLM-backed ITS systems requires designing safeguards that ensure positive experiences for students. We describe the design of a conversational system integrated into an ITS, and our experience evaluating its safety with red-teaming, an in-classroom usability test, and field deployment. We present empirical data from more than 8,000 student conversations with this system, finding that GPT-3.5 rarely generates inappropriate messages. Comparatively more common are inappropriate messages from students, which prompts us to reason about safeguarding as a content moderation and classroom management problem. The student…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
