Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding

Maciej Skorski; Alina Landowska

arXiv:2508.13804 · cs.CL · November 24, 2025

TL;DR

This paper introduces a Bayesian framework to evaluate how well large language models understand moral values, accounting for human disagreement and model uncertainty, and finds that AI models perform comparably to top human annotators in moral detection.

Contribution

It presents the first large-scale Bayesian evaluation of LLMs' moral understanding, modeling uncertainties and comparing AI performance to human annotators across diverse texts.

Findings

01. AI models rank in the top 25% of human annotators.

02. Models produce fewer false negatives than humans.

03. Bayesian framework effectively captures annotator disagreement.

Abstract

How do Large Language Models understand moral dimensions compared to humans? This first large-scale Bayesian evaluation of market-leading language models provides the answer. In contrast to prior work using deterministic ground truth (majority or inclusion rules), we model annotator disagreements to capture both aleatoric uncertainty (inherent human disagreement) and epistemic uncertainty (model domain sensitivity). We evaluated the best language models (Claude Sonnet 4, DeepSeek-V3, Llama 4 Maverick) across 250K+ annotations from nearly 700 annotators in 100K+ texts spanning social networks, news and forums. Our GPU-optimized Bayesian framework processed 1M+ model queries, revealing that AI models typically rank among the top 25% of human annotators, achieving much better than average balanced accuracy. Importantly, we find that AI produces far fewer false negatives than humans,…
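The two metrics named in the abstract, balanced accuracy and false negatives, can be illustrated with a minimal sketch. It assumes binary labels (1 = moral value present) and a fixed list of reference labels; the paper instead derives its reference from a Bayesian model of annotator disagreement, which is not reproduced here. The function names and the toy data are hypothetical and do not come from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif t and not p:
            fn += 1  # a moral value was present but missed
        elif not t and p:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return (tpr + tnr) / 2


def false_negative_rate(y_true, y_pred):
    """Share of positive items that the rater labeled as negative."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return fn / (tp + fn)


# Toy data (hypothetical): reference labels and one rater's predictions.
reference = [1, 1, 1, 1, 0, 0, 0, 0]
rater = [1, 1, 1, 0, 1, 0, 0, 0]

ba = balanced_accuracy(reference, rater)
fnr = false_negative_rate(reference, rater)
print(ba, fnr)
```

Balanced accuracy weights the positive and negative classes equally, which matters when moral content is rare in the corpus and plain accuracy would reward a rater who labels everything as negative.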

Taxonomy

Topics: Legal Education and Practice Innovations