Towards Reliable Generative AI-Driven Scaffolding: Reducing Hallucinations and Enhancing Quality in Self-Regulated Learning Support

Keyang Qian; Shiqi Liu; Tongguang Li; Mladen Raković; Xinyu Li; Rui Guan; Inge Molenaar; Sadia Nawaz; Zachari Swiecki; Lixiang Yan; Dragan Gašević

arXiv:2508.05929 · cs.CY · September 11, 2025

TL;DR

This paper presents two novel evaluation methods for AI-generated educational scaffolds that significantly reduce hallucinations and improve quality, enhancing the reliability of self-regulated learning support systems.

Contribution

It introduces multi-agent reliability evaluation and LLM-as-a-Judge techniques to assess and improve the quality of generative AI scaffolds in education.

Findings

01. Reliability evaluation outperforms baselines and aligns with human judgment.

02. Both methods effectively reduce hallucinations in generated scaffolds.

03. The study identifies bias limitations in the LLM-as-a-Judge approach.

Abstract

Generative Artificial Intelligence (GenAI) holds the potential to advance existing educational technologies with capabilities to automatically generate personalised scaffolds that support students' self-regulated learning (SRL). While advancements in large language models (LLMs) promise improvements in the adaptability and quality of educational technologies for SRL, concerns remain about hallucinations in LLM-generated content, which can compromise both the learning experience and ethical standards. To address these challenges, we proposed GenAI-enabled approaches for evaluating personalised SRL scaffolds before they are presented to students, aiming to reduce hallucinations and improve the overall quality of LLM-generated personalised scaffolds. Specifically, two approaches are investigated. The first approach involved developing a multi-agent system approach for…
