Compounding Disadvantage: Auditing Intersectional Bias in LLM-Generated Explanations Across Indian and American STEM Education
Amogh Gupta, Niharika Patil, Sourojit Ghosh, SnehalKumar (Neil) S Gaikwad

TL;DR
This study reveals that large language models systematically disadvantage marginalized students in STEM education across Indian and American contexts, with biases persisting across models and institutions.
Contribution
The paper provides an intersectional bias audit of four LLMs across Indian and American educational settings and highlights the need for structural bias mitigation.
Findings
The gap between the most privileged and most marginalized profiles reaches 2.55 grade levels.
Income consistently influences model outputs across contexts.
Biases compound non-additively across multiple marginalized dimensions.
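One common way to make "non-additive" concrete is to compare the gap observed for a profile that is marginalized on two dimensions with the sum of the gaps observed for each dimension alone. The sketch below assumes that reading; the function name `interaction_excess` and the numbers are hypothetical illustrations, not values or methods taken from the paper.

```python
def interaction_excess(gap_a: float, gap_b: float, gap_ab: float) -> float:
    """Return how far the joint gap departs from the additive expectation.

    gap_a, gap_b: grade-level gaps for each marginalized dimension alone.
    gap_ab: grade-level gap when both dimensions are present together.
    A result of 0 means the effects simply add; a positive result means
    the joint disadvantage is larger than the sum of its parts.
    """
    return gap_ab - (gap_a + gap_b)


# Hypothetical placeholder values, chosen only to illustrate the arithmetic.
excess = interaction_excess(gap_a=0.5, gap_b=0.75, gap_ab=1.5)
print(excess)
```

Under these placeholder inputs the joint gap exceeds the additive expectation, which is the pattern the term "compounding" describes.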
Abstract
Large language models are increasingly deployed in STEM education for personalized instruction and feedback across institutions in high- and low-income countries. These systems are designed to adapt content to student needs, but whether they adapt based on demonstrated ability or demographic signals remains untested at scale. Here we establish that LLM-generated STEM content systematically disadvantages marginalized student profiles across two cultural contexts, with the gap between the most privileged and most marginalized profiles reaching 2.55 grade levels. We audited four LLMs (Qwen 2.5-32B-Instruct, GPT-4o, GPT-4o-mini, GPT-OSS 20B) using synthetic profiles crossing dimensions specific to Indian education (caste, medium of instruction, college tier) and American education (race, HBCU attendance, school type), alongside income, gender, and disability, across ranking and generation…
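The crossing of profile dimensions described in the abstract can be sketched as a Cartesian product over attribute levels. This is a minimal illustration, not the authors' code: the dimension names come from the abstract, while the two placeholder levels per dimension and the helper `cross_profiles` are assumptions made for the example.

```python
from itertools import product

# Dimension names follow the abstract; the levels are hypothetical placeholders.
SHARED_DIMS = {
    "income": ["level_1", "level_2"],
    "gender": ["level_1", "level_2"],
    "disability": ["level_1", "level_2"],
}
INDIA_DIMS = {
    "caste": ["level_1", "level_2"],
    "medium_of_instruction": ["level_1", "level_2"],
    "college_tier": ["level_1", "level_2"],
}
US_DIMS = {
    "race": ["level_1", "level_2"],
    "hbcu_attendance": ["level_1", "level_2"],
    "school_type": ["level_1", "level_2"],
}


def cross_profiles(context_dims, shared_dims):
    """Build every combination of levels across context-specific and shared dimensions."""
    dims = {**context_dims, **shared_dims}
    names = list(dims)
    return [
        dict(zip(names, combo))
        for combo in product(*(dims[name] for name in names))
    ]


india_profiles = cross_profiles(INDIA_DIMS, SHARED_DIMS)
us_profiles = cross_profiles(US_DIMS, SHARED_DIMS)
print(len(india_profiles), len(us_profiles))
```

With two placeholder levels on each of six dimensions, each context yields 2^6 = 64 synthetic profiles; the actual number of levels and profiles used in the study is not given in the text above.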
