Can Language Models Employ the Socratic Method? Experiments with Code Debugging

Erfan Al-Hossami; Razvan Bunescu; Justin Smith; Ryan Teehan

arXiv:2310.03210 · cs.CL · October 6, 2023 · 1 citation

TL;DR

This paper introduces a new dataset and benchmarking framework to evaluate whether language models can effectively employ the Socratic method for code debugging, aiming to enhance automated teaching tools.

Contribution

It presents a manually curated dataset for Socratic debugging and benchmarks various language models' abilities to guide novice programmers in fixing bugs.

Findings

01. GPT-4 with chain of thought prompting performs best
02. Fine-tuned Flan-T5 shows moderate success
03. Zero-shot GPT-4 outperforms other models

Abstract

When employing the Socratic method of teaching, instructors guide students toward solving a problem on their own rather than providing the solution directly. While this strategy can substantially improve learning outcomes, it is usually time-consuming and cognitively demanding. Automated Socratic conversational agents can augment human instruction and provide the necessary scale; however, their development is hampered by the lack of suitable data for training and evaluation. In this paper, we introduce a manually created dataset of multi-turn Socratic advice that is aimed at helping a novice programmer fix buggy solutions to simple computational problems. The dataset is then used for benchmarking the Socratic debugging abilities of a number of language models, ranging from fine-tuning the instruction-based text-to-text transformer Flan-T5 to zero-shot and chain of thought prompting of…

Code & Models

Repositories

taisazero/socratic-debugging-benchmark (official)

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Dropout · Dense Connections · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Multi-Head Attention · Layer Normalization