Book2Dial: Generating Teacher-Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots
Junling Wang, Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Mrinmaya Sachan

TL;DR
This paper introduces a framework for generating synthetic teacher-student dialogues from textbooks to facilitate cost-effective development of educational chatbots, highlighting methods, challenges, and insights for improving dialogue quality.
Contribution
The paper proposes a framework for synthesizing teacher-student dialogues grounded in textbooks, compares approaches based on prompting or fine-tuning large language models, and uses the resulting synthetic dialogues to train educational chatbots.
Findings
Synthetic dialogues improve chatbot training across domains
Fine-tuning enhances dialogue quality but still faces hallucination issues
Open-source data and code support future research
Abstract
Educational chatbots are a promising tool for assisting student learning. However, the development of effective chatbots in education has been challenging, as high-quality data is seldom available in this domain. In this paper, we propose a framework for generating synthetic teacher-student interactions grounded in a set of textbooks. Our approaches capture one aspect of learning interactions where curious students with partial knowledge interactively ask a teacher questions about the material in the textbook. We highlight various quality criteria that such dialogues should fulfill and compare several approaches relying on either prompting or fine-tuning large language models. We use synthetic dialogues to train educational chatbots and show benefits of further fine-tuning in different educational domains. However, human evaluation shows that our best data synthesis method still suffers…
Taxonomy
Topics: AI in Service Interactions · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
Methods: Sparse Evolutionary Training
