Towards Valid Student Simulation with Large Language Models
Zhihao Yuan, Yunze Xiao, Ming Li, Weihao Xuan, Richard Tong, Mona Diab, Tom Mitchell

TL;DR
This paper develops a conceptual framework for creating valid student simulations using large language models, emphasizing epistemic fidelity over surface realism to improve educational research and practice.
Contribution
It introduces an epistemic state specification and a goal-by-environment framework to enhance the validity of LLM-based student simulations, addressing core limitations in current approaches.
Findings
Identifies the competence paradox as a key challenge.
Proposes a constrained generation approach for simulation fidelity.
Synthesizes literature and outlines open challenges in the field.
Abstract
This paper presents a conceptual and methodological framework for large language model (LLM)-based student simulation in educational settings. The authors identify a core failure mode, termed the "competence paradox," in which broadly capable LLMs are asked to emulate partially knowledgeable learners, leading to unrealistic error patterns and learning dynamics. To address this, the paper reframes student simulation as a constrained generation problem governed by an explicit Epistemic State Specification (ESS), which defines what a simulated learner can access, how errors are structured, and how learner state evolves over time. The work further introduces a Goal-by-Environment framework to situate simulated student systems according to behavioral objectives and deployment contexts. Rather than proposing a new system or benchmark, the paper synthesizes prior literature, formalizes key…
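The abstract names three things an ESS defines: what the learner can access, how errors are structured, and how learner state evolves. The sketch below is one possible way to hold those three pieces in a small data structure and turn them into generation constraints. It is an illustration only, not the authors' specification: the class `EpistemicState`, the function `build_constraints`, the mastery threshold, and the update rule are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class EpistemicState:
    """Hypothetical container for the three things the abstract says an ESS defines."""

    # What the simulated learner can access: skill name -> mastery in [0, 1] (assumed scale).
    mastery: dict[str, float] = field(default_factory=dict)
    # How errors are structured: skill name -> a named misconception (assumed format).
    misconceptions: dict[str, str] = field(default_factory=dict)
    # How state evolves: fraction of the remaining gap closed per practice step (assumed rule).
    learning_rate: float = 0.2

    def can_access(self, skill: str, threshold: float = 0.5) -> bool:
        # A skill counts as accessible once mastery reaches the threshold.
        return self.mastery.get(skill, 0.0) >= threshold

    def practice(self, skill: str) -> None:
        # Move mastery toward 1.0 by a fixed fraction of the remaining gap.
        current = self.mastery.get(skill, 0.0)
        self.mastery[skill] = min(1.0, current + self.learning_rate * (1.0 - current))


def build_constraints(state: EpistemicState, required_skills: list[str]) -> list[str]:
    """Turn the state into plain-text constraints that could be placed in an LLM prompt."""
    lines = []
    for skill in required_skills:
        if state.can_access(skill):
            lines.append(f"May use: {skill}")
        elif skill in state.misconceptions:
            lines.append(f"Apply misconception for {skill}: {state.misconceptions[skill]}")
        else:
            lines.append(f"Must not use: {skill}")
    return lines


# Example learner: solid on whole numbers, shaky on fraction addition.
state = EpistemicState(
    mastery={"whole-number addition": 0.9, "fraction addition": 0.3},
    misconceptions={"fraction addition": "adds numerators and denominators separately"},
)
constraints = build_constraints(
    state, ["whole-number addition", "fraction addition", "fraction division"]
)
for line in constraints:
    print(line)
```

Calling `state.practice("fraction addition")` repeatedly raises that skill's mastery, so a later call to `build_constraints` would list it under "May use" instead of applying the misconception. Under these assumptions, that is one simple way to make the learner's state evolve across turns rather than staying fixed.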
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Innovative Teaching and Learning Methods · Online Learning and Analytics
