NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors

Numaan Naeem; Sarfraz Ahmad; Momina Ahsan; Hasan Iqbal

arXiv:2506.10627·cs.CL·June 13, 2025

TL;DR

This paper presents a retrieval-augmented few-shot prompting system built on GPT-4o for mistake identification in AI tutors. The system retrieves semantically similar examples, constructs structured prompts, and uses schema-guided output parsing to produce interpretable predictions.

Contribution

It presents a novel retrieval-augmented prompting approach that enhances mistake detection in AI tutoring systems, outperforming baseline methods.

Findings

01

Retrieval-augmented prompting improves mistake identification accuracy.

02

Combining multiple models yields better performance than individual approaches.

03

The system provides interpretable, schema-guided predictions.

Abstract

This paper presents our system for Track 1: Mistake Identification in the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The task involves evaluating whether a tutor's response correctly identifies a mistake in a student's mathematical reasoning. We explore four approaches: (1) an ensemble of machine learning models over pooled token embeddings from multiple pretrained language models (LMs); (2) a frozen sentence-transformer using [CLS] embeddings with an MLP classifier; (3) a history-aware model with multi-head attention between token-level history and response embeddings; and (4) a retrieval-augmented few-shot prompting system with a large language model (LLM), namely GPT-4o. Our final system retrieves semantically similar examples, constructs structured prompts, and uses schema-guided output parsing to produce interpretable predictions. It outperforms all…
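The three stages the abstract names for the final system (retrieve similar examples, build a structured prompt, parse output against a schema) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words similarity stands in for a real sentence encoder, the label set, field names, and helper functions are all hypothetical, and the LLM call itself is omitted.

```python
import json
import math
import re
from collections import Counter

# Assumed label set for illustration; the abstract does not list the labels.
LABELS = ("Yes", "To some extent", "No")


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=2):
    """Return the k labeled examples most similar to the query response."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex["response"])), reverse=True)
    return ranked[:k]


def build_prompt(query, shots):
    """Assemble a few-shot prompt that asks for JSON matching a fixed schema."""
    lines = [
        "Decide whether the tutor response identifies the student's mistake.",
        'Reply with JSON of the form {"label": ..., "rationale": ...}, '
        "where label is one of " + json.dumps(list(LABELS)) + ".",
        "",
    ]
    for ex in shots:
        lines.append("Tutor response: " + ex["response"])
        lines.append("Answer: " + json.dumps({"label": ex["label"]}))
        lines.append("")
    lines.append("Tutor response: " + query)
    lines.append("Answer:")
    return "\n".join(lines)


def parse_prediction(raw):
    """Validate raw model output against the schema; raise ValueError if it does not fit."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or data.get("label") not in LABELS:
        raise ValueError("output does not match the expected schema")
    return {"label": data["label"], "rationale": str(data.get("rationale", ""))}
```

In use, the prompt returned by `build_prompt(query, retrieve(query, pool))` would be sent to the LLM, and the reply passed through `parse_prediction`. Rejecting replies that do not fit the schema keeps every accepted prediction in a fixed, inspectable form.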

Code & Models

Repositories

naumannaeem/bea_2025 (official)

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Model Reduction and Neural Networks

Methods: Cosine Annealing · Layer Normalization · Linear Warmup With Cosine Annealing · Attention Dropout · Discriminative Fine-Tuning · Byte Pair Encoding · Softmax · Linear Layer · Dropout