An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks

Gabriel Stefan; Adrian-Marius Dumitran

arXiv:2604.07883·cs.AI·April 10, 2026

TL;DR

This paper introduces an agentic evaluation architecture for detecting biases in educational textbooks, combining multimodal screening, diverse evaluative agents, and a source attribution protocol to improve accuracy and reduce false positives.

Contribution

The paper presents a novel agentic evaluation framework with a source attribution protocol, demonstrating improved bias detection and cost-effectiveness in analyzing history textbooks.

Findings

01

83.3% of 270 screened textbook excerpts classified as pedagogically acceptable

02

Agentic deliberation mitigated over-penalization: mean severity 2.9/7 versus 5.4/7 under a zero-shot baseline

03

Preferred in 64.8% of human evaluations over baselines

Abstract

History textbooks often contain implicit biases, nationalist framing, and selective omissions that are difficult to audit at scale. We propose an agentic evaluation architecture comprising a multimodal screening agent, a heterogeneous jury of five evaluative agents, and a meta-agent for verdict synthesis and human escalation. A central contribution is a Source Attribution Protocol that distinguishes textbook narrative from quoted historical sources, preventing the misattribution that causes systematic false positives in single-model evaluators. In an empirical study on Romanian upper-secondary history textbooks, 83.3% of 270 screened excerpts were classified as pedagogically acceptable (mean severity 2.9/7), versus 5.4/7 under a zero-shot baseline, demonstrating that agentic deliberation mitigates over-penalization. In a blind human evaluation (18 evaluators, 54 comparisons), the…
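The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Segment` type, the `narrative_only` and `synthesize` functions, the use of the median as the aggregation rule, and both thresholds are assumptions chosen for illustration. Only the jury size of five and the 7-point severity scale come from the abstract.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Segment:
    text: str
    is_quoted_source: bool  # True for a quoted historical source, False for textbook narrative


def narrative_only(segments):
    """Hypothetical attribution step: keep only the textbook's own narrative,
    so that the wording of a quoted source is not attributed to the textbook."""
    return [s for s in segments if not s.is_quoted_source]


def synthesize(scores, accept_max=3.0, spread_max=2):
    """Hypothetical meta-agent: take the median of five jury severity scores
    (7-point scale) and escalate to a human when the jurors disagree widely.
    The thresholds accept_max and spread_max are illustrative assumptions."""
    if len(scores) != 5:
        raise ValueError("expected five jury scores")
    verdict = median(scores)
    spread = max(scores) - min(scores)
    if spread > spread_max:
        return "escalate", verdict
    return ("acceptable" if verdict <= accept_max else "flagged"), verdict


if __name__ == "__main__":
    excerpt = [
        Segment("Placeholder narrative sentence written by the textbook authors.", False),
        Segment("Placeholder quotation from a historical source.", True),
    ]
    print(len(narrative_only(excerpt)), "narrative segment(s) sent to the jury")
    print(synthesize([2, 3, 3, 2, 4]))
```

The median is used here because it is robust to a single outlying juror; the paper may combine verdicts differently.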
