The BEA 2023 Shared Task on Generating AI Teacher Responses in Educational Dialogues
Anaïs Tack, Ekaterina Kochmar, Zheng Yuan, Serge Bibauw, Chris Piech

TL;DR
This paper reports on the first shared task evaluating AI models' ability to generate teacher responses in educational dialogues, comparing various models and highlighting evaluation challenges.
Contribution
It introduces a benchmark for AI teacher response generation in educational dialogues and evaluates multiple state-of-the-art models using both automated and human assessments.
Findings
NAISTeacher with GPT-3.5 achieved top scores
Automated metrics showed limitations in educational context evaluation
Ensemble prompt strategies improved response quality
Abstract
This paper describes the results of the first shared task on the generation of teacher responses in educational dialogues. The goal of the task was to benchmark the ability of generative language models to act as AI teachers, replying to a student in a teacher-student dialogue. Eight teams participated in the competition hosted on CodaLab. They experimented with a wide variety of state-of-the-art models, including Alpaca, Bloom, DialoGPT, DistilGPT-2, Flan-T5, GPT-2, GPT-3, GPT-4, LLaMA, OPT-2.7B, and T5-base. Their submissions were automatically scored using BERTScore and DialogRPT metrics, and the top three among them were further manually evaluated in terms of pedagogical ability based on Tack and Piech (2022). The NAISTeacher system, which ranked first in both automated and human evaluation, generated responses with GPT-3.5 using an ensemble of prompts and a DialogRPT-based ranking…
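The general pattern the abstract mentions, generating candidates from several prompts and keeping the one a ranker scores highest, can be sketched as follows. This is a minimal illustration and not the NAISTeacher implementation: `pick_best_response`, `toy_generate`, and `toy_score` are hypothetical names, the generator stands in for a call to a language model, and the scorer is a simple word-overlap placeholder rather than DialogRPT.

```python
from typing import Callable, List, Tuple


def pick_best_response(
    dialogue: str,
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Generate one candidate per prompt template and return the top-scoring one."""
    if not prompts:
        raise ValueError("at least one prompt template is required")
    # Each template is filled with the dialogue and sent to the generator.
    candidates = [generate(p.format(dialogue=dialogue)) for p in prompts]
    # The ranker scores every candidate against the dialogue context.
    scored = [(c, score(dialogue, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


def toy_generate(prompt: str) -> str:
    # Hypothetical stand-in for a language model call (assumption).
    if prompt.startswith("Encourage"):
        return "Good try! What is the past tense of go?"
    return "No."


def toy_score(dialogue: str, response: str) -> float:
    # Placeholder ranker (assumption): counts words shared with the context.
    # A real system would call a learned ranking model here instead.
    context_words = set(dialogue.lower().split())
    cleaned = response.lower()
    for mark in "?!.":
        cleaned = cleaned.replace(mark, "")
    return float(len(context_words & set(cleaned.split())))


if __name__ == "__main__":
    dialogue = "Student: I goed to the park yesterday."
    prompts = [
        "Encourage the student.\n{dialogue}\nTeacher:",
        "Reply briefly.\n{dialogue}\nTeacher:",
    ]
    best, best_score = pick_best_response(dialogue, prompts, toy_generate, toy_score)
    print(best, best_score)
```

Passing the generator and the ranker as callables keeps the selection logic independent of any particular model, so either piece can be swapped without touching the other.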
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Intelligent Tutoring Systems and Adaptive Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Discriminative Fine-Tuning · Cosine Annealing · Layer Normalization · Weight Decay
