TL;DR
SCRIBE is a framework that enables small, open-source language models to perform multi-hop, tool-augmented reasoning for educational feedback, ensuring privacy, reliability, and pedagogical validity.
Contribution
The paper introduces SCRIBE, a multi-hop reasoning framework with tool use and error recovery, and distils it into small (3B and 8B) models by fine-tuning on synthetic data for educational contexts.
Findings
8B-SCRIBE models match or surpass larger models in relevance and actionability.
Students perceive the SCRIBE models as comparable to GPT-4o and Llama-3.3 70B.
The results demonstrate the viability of low-resource models for privacy-sensitive educational tasks.
Abstract
Language models can be used to provide interactive, personalized student feedback in educational settings. However, real-world deployment faces three key challenges: privacy concerns, limited computational resources, and the need for pedagogically valid responses. These constraints require small, open-source models that can run locally and reliably ground their outputs in correct information. We introduce SCRIBE, a framework for multi-hop, tool-augmented reasoning designed to generate valid responses to student questions about feedback reports. SCRIBE combines domain-specific tools with a self-reflective inference pipeline that supports iterative reasoning, tool use, and error recovery. We distil these capabilities into 3B and 8B models via two-stage LoRA fine-tuning on synthetic GPT-4o-generated data. Evaluation with a human-aligned GPT-Judge and a user study with 108 students shows…
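The abstract describes an inference pipeline that interleaves reasoning, tool calls, and error recovery. A minimal sketch of such a loop follows; it is not the authors' implementation. The tool `lookup_score`, the sample scores, the `scripted_policy` stand-in for the language model, and the `max_hops` budget are all hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical domain tool: looks up a value from a student's feedback report.
def lookup_score(item: str) -> str:
    scores = {"quiz_1": "62%", "quiz_2": "48%"}  # made-up sample data
    if item not in scores:
        raise KeyError(f"unknown item: {item}")
    return scores[item]

TOOLS: Dict[str, Callable[[str], str]] = {"lookup_score": lookup_score}

# An action is either ("answer", text) or (tool_name, argument).
Action = Tuple[str, str]
Policy = Callable[[str, List[str]], Action]

def run_pipeline(policy: Policy, question: str, max_hops: int = 4) -> Tuple[str, List[str]]:
    """Iterate: ask the policy for an action, run the tool, record the result."""
    trace: List[str] = []
    for _ in range(max_hops):
        name, arg = policy(question, trace)
        if name == "answer":
            return arg, trace
        try:
            observation = TOOLS[name](arg)
        except Exception as exc:
            # Error recovery: feed the failure back instead of aborting,
            # so the policy can choose a different call on the next hop.
            observation = f"ERROR: {exc!r}"
        trace.append(f"{name}({arg}) -> {observation}")
    return "No answer within hop budget.", trace

# Scripted stand-in for the model: first call fails, second call recovers.
def scripted_policy(question: str, trace: List[str]) -> Action:
    if not trace:
        return ("lookup_score", "quiz_3")  # deliberately wrong argument
    if "ERROR" in trace[-1]:
        return ("lookup_score", "quiz_2")  # retry with a valid argument
    value = trace[-1].split(" -> ")[1]
    return ("answer", f"Your quiz_2 score was {value}.")

answer, trace = run_pipeline(scripted_policy, "How did I do on quiz 2?")
```

In this sketch the failed call is kept in the trace as an observation, so the next hop can react to it; a real system would replace `scripted_policy` with a model call that reads the question and trace.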