Do Large Language Models Walk Their Talk? Measuring the Gap Between Implicit Associations, Self-Report, and Behavioral Altruism
Sandro Andric

TL;DR
This study assesses whether large language models exhibit genuine altruism by comparing their implicit associations, self-reports, and actual behavior. It finds that models significantly overestimate their own altruism in self-assessments, and that implicit bias is only weakly and non-significantly linked to behavior (r = .22, p = .29).
Contribution
It introduces a multi-method approach inspired by social psychology to evaluate altruism in LLMs and proposes the Calibration Gap as a new alignment metric.
Findings
Models show strong implicit pro-altruism bias.
Models behave more altruistically than chance, with variation.
Implicit associations do not predict actual altruistic behavior.
Abstract
We investigate whether Large Language Models (LLMs) exhibit altruistic tendencies, and critically, whether their implicit associations and self-reports predict actual altruistic behavior. Using a multi-method approach inspired by human social psychology, we tested 24 frontier LLMs across three paradigms: (1) an Implicit Association Test (IAT) measuring implicit altruism bias, (2) a forced binary choice task measuring behavioral altruism, and (3) a self-assessment scale measuring explicit altruism beliefs. Our key findings are: (1) All models show strong implicit pro-altruism bias (mean IAT = 0.87, p < .0001), confirming models "know" altruism is good. (2) Models behave more altruistically than chance (65.6% vs. 50%, p < .0001), but with substantial variation (48-85%). (3) Implicit associations do not predict behavior (r = .22, p = .29). (4) Most critically, models systematically…
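As a rough sanity check on finding (3), the sketch below recomputes the significance of a correlation of r = .22 across 24 models. It assumes r is a Pearson correlation tested with the standard t-test on n - 2 degrees of freedom; the abstract does not state which test the author used, and the function names are illustrative only. The t density is integrated numerically so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value, using Simpson's rule on the t density over [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_area)


def pearson_r_test(r, n):
    """t statistic and two-sided p-value for a Pearson r from n pairs (assumed test)."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    return t, two_sided_p(t, df)


# Values taken from the abstract: r = .22 across 24 models.
t_stat, p_value = pearson_r_test(0.22, 24)
print(f"t = {t_stat:.3f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the t statistic is close to 1.06 and the p-value lands near 0.3, which is in line with the reported p = .29 once rounding of r is taken into account. With only 24 models, a correlation of this size cannot be distinguished from zero.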
Taxonomy
Topics: Computational and Text Analysis Methods · Language and cultural evolution · Artificial Intelligence in Healthcare and Education
