Improving Socratic Question Generation using Data Augmentation and Preference Optimization

Nischal Ashok Kumar; Andrew Lan

arXiv:2403.00199 · cs.CL · April 22, 2024 · 1 citation

TL;DR

This paper enhances Socratic question generation by augmenting data with invalid questions and optimizing language models to prefer valid questions, leading to more accurate and relevant Socratic questioning in educational contexts.

Contribution

The paper introduces a data augmentation technique that enriches Socratic questioning datasets with invalid questions, and applies preference optimization so that LLMs learn to prefer valid Socratic questions over invalid ones.

Findings

01

DPO-optimized Llama 2 outperforms existing prompting methods.

02

The method effectively reduces invalid question generation.

03

The method improves the quality of Socratic questions for student code debugging.
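
For readers unfamiliar with DPO (direct preference optimization), the objective behind the first finding can be sketched in a few lines. This is a minimal illustration of the standard DPO loss for a single preference pair, not the authors' training code; the function name, the argument names, and the default `beta=0.1` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model.
    Names and the beta default are illustrative assumptions.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

In this setting the chosen response would be a valid Socratic question and the rejected response an invalid one. The loss falls below log 2 once the policy favors the valid question more strongly than the reference model does.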

Abstract

The Socratic method is a way of guiding students toward solving a problem independently without directly revealing the solution to the problem. Although this method has been shown to significantly improve student learning outcomes, it remains a complex labor-intensive task for instructors. Large language models (LLMs) can be used to augment human effort by automatically generating Socratic questions for students. However, existing methods that involve prompting these LLMs sometimes produce invalid outputs, e.g., those that directly reveal the solution to the problem or provide irrelevant or premature questions. To alleviate this problem, inspired by reinforcement learning with AI feedback (RLAIF), we first propose a data augmentation method to enrich existing Socratic questioning datasets with questions that are invalid in specific ways. Next, we propose a method to optimize open-source…
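
One way to picture the augmented data is as a set of preference pairs, each matching a valid Socratic question against an invalid one for the same student problem. The sketch below is a hypothetical illustration, not the authors' pipeline: the category labels are shorthand for the invalid types named in the abstract, and the `PreferencePair` class, the `build_preference_pairs` helper, and the all-combinations pairing scheme are assumptions made for this example.

```python
from dataclasses import dataclass

# Shorthand labels (assumed) for the invalid types the abstract mentions:
# questions that directly reveal the solution, are irrelevant, or are premature.
INVALID_CATEGORIES = ("direct", "irrelevant", "premature")


@dataclass(frozen=True)
class PreferencePair:
    prompt: str             # problem statement, buggy code, dialogue so far
    chosen: str             # valid Socratic question
    rejected: str           # invalid question
    rejected_category: str  # which kind of invalid question it is


def build_preference_pairs(prompt, valid_questions, invalid_questions):
    """Pair every valid question with every invalid one (assumed scheme).

    invalid_questions maps a category label to a list of questions.
    """
    pairs = []
    for chosen in valid_questions:
        for category, questions in invalid_questions.items():
            for rejected in questions:
                pairs.append(PreferencePair(prompt, chosen, rejected, category))
    return pairs
```

Pairs in this shape (prompt, chosen, rejected) are the usual input to preference optimization methods such as DPO.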

Code & Models

Repositories

umass-ml4ed/socratic-quest-gen
PyTorch · Official

Videos

Improving Socratic Question Generation using Data Augmentation and Preference Optimization · Underline

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Technology and Assessment · Educational Assessment and Pedagogy