Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs
Naomi Esposito, Anthony Tricarico, Luisa Porzio, Ali Aghazadeh Ardebili, Massimo Stella

TL;DR
This paper introduces MEDS, a comprehensive dataset of 28,000 personas across 14 LLMs, capturing their reasoning, perceptions, and biases in math tasks to improve AI-driven math education.
Contribution
The paper presents MEDS, a novel dataset integrating math reasoning, attitudes, and sociodemographic data across diverse LLMs and human-like conditions for educational AI research.
Findings
LLMs exhibit family-specific biases, such as negative attitudes toward math.
Data validation confirms schema integrity and persona consistency.
MEDS captures complex psychological and cognitive aspects beyond traditional benchmarks.
Abstract
To enhance LLMs' impact on math education, we need data on their mathematical prowess and biases across prompts. To fill this gap, we introduce MEDS (Math Education Digital Shadows) as a dataset mapping how large language models reason about and report mathematics across human- and AI-like conditions. MEDS involves 28,000 personas from 14 LLMs (from families like Mistral, Qwen, DeepSeek, Granite, Phi and Grok) shadowing either humans or AI assistants. Each record/shadow includes a set of prompts along with psychological/sociodemographic persona metadata and four types of math tasks: (i) open math interview, (ii) three psychometric tests about math perceptions with explanations, (iii) cognitive networks capturing math attitudes, and (iv) 18 high-school math test questions together with their reasoning and confidence scores. MEDS differs from traditional score-only math benchmarks because…
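The record layout described in the abstract can be sketched as a small data structure. This is an illustrative sketch only, not the published MEDS schema: every class and field name below (`Shadow`, `TestItem`, `condition`, `network_edges`, and so on) is a hypothetical choice, and the only figures taken from the abstract are the three psychometric tests and the 18 test questions. The confidence scale is not stated in the abstract, so it is left as an unconstrained number.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TestItem:
    """One high-school math question answered by a shadow (hypothetical fields)."""
    question_id: int
    answer: str
    reasoning: str
    confidence: float  # scale is an assumption; the abstract does not state it


@dataclass
class Shadow:
    """One persona record; field names are assumptions, not the real schema."""
    model: str                               # e.g. a model from the Qwen family
    condition: str                           # "human" or "ai_assistant"
    persona: Dict[str, str]                  # psychological/sociodemographic metadata
    prompts: List[str]
    interview: str                           # (i) open math interview
    psychometric: Dict[str, List[str]]       # (ii) test name -> answers with explanations
    network_edges: List[Tuple[str, str]]     # (iii) cognitive network as concept pairs
    test_items: List[TestItem] = field(default_factory=list)  # (iv) math test


def check_shadow(shadow: Shadow) -> List[str]:
    """Return a list of structural problems; an empty list means the record passes."""
    problems: List[str] = []
    if shadow.condition not in ("human", "ai_assistant"):
        problems.append("unknown condition")
    if len(shadow.psychometric) != 3:
        problems.append("expected 3 psychometric tests")
    if len(shadow.test_items) != 18:
        problems.append("expected 18 test questions")
    return problems
```

A check of this kind is one way to picture the schema-integrity validation mentioned in the findings; the actual validation procedure used for MEDS may differ.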
