Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values
Hadi Hosseini, Samarth Khanna

TL;DR
This paper evaluates large language models' ability to adhere to fairness principles in resource distribution, revealing current limitations in aligning with human preferences and exploring potential strategies for improvement.
Contribution
The paper provides a comparative benchmark of several LLMs' responses to resource-distribution problems and analyzes their robustness and their potential for closer alignment with human values.
Findings
LLMs currently do not align well with human fairness preferences.
They cannot effectively use money as a transferable resource to reduce inequality.
Response robustness varies with semantic and non-semantic prompt changes.
Abstract
The growing interest in employing large language models (LLMs) for decision-making in social and economic contexts has raised questions about their potential to function as agents in these domains. A significant number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, plays a critical role in the desirability of outcomes. In this paper, we examine whether LLM responses adhere to fundamental fairness concepts such as equitability, envy-freeness, and Rawlsian maximin, and investigate their alignment with human preferences. We evaluate the performance of several LLMs, providing a comparative benchmark of their ability to reflect these measures. Our results demonstrate a lack of alignment between current LLM responses and human distributional preferences. Moreover, LLMs are unable to utilize money as a transferable resource to…
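The three fairness concepts named in the abstract can be made concrete with a small sketch. It assumes additive valuations over indivisible goods, which is a common textbook setting and not necessarily the exact formalization used in the paper. All function names and the example numbers are hypothetical, chosen only for illustration.

```python
from typing import List

Valuations = List[List[float]]   # valuations[i][g]: agent i's value for good g
Allocation = List[List[int]]     # allocation[i]: indices of goods given to agent i


def bundle_value(valuations: Valuations, agent: int, bundle: List[int]) -> float:
    """Value an agent assigns to a bundle, assuming additive valuations."""
    return sum(valuations[agent][g] for g in bundle)


def is_envy_free(valuations: Valuations, allocation: Allocation) -> bool:
    """Envy-freeness: no agent values another agent's bundle above its own."""
    n = len(allocation)
    return all(
        bundle_value(valuations, i, allocation[i])
        >= bundle_value(valuations, i, allocation[j])
        for i in range(n)
        for j in range(n)
    )


def is_equitable(valuations: Valuations, allocation: Allocation) -> bool:
    """Equitability: every agent derives the same value from its own bundle."""
    own = [bundle_value(valuations, i, b) for i, b in enumerate(allocation)]
    return max(own) == min(own)


def maximin_value(valuations: Valuations, allocation: Allocation) -> float:
    """Rawlsian maximin objective: the value received by the worst-off agent."""
    return min(bundle_value(valuations, i, b) for i, b in enumerate(allocation))


# Hypothetical two-agent, three-good instance (illustrative numbers only).
valuations = [[6, 3, 1], [2, 4, 4]]
allocation = [[0], [1, 2]]       # agent 0 gets good 0; agent 1 gets goods 1 and 2

print(is_envy_free(valuations, allocation))   # True: 6 >= 4 and 8 >= 2
print(is_equitable(valuations, allocation))   # False: own values are 6 and 8
print(maximin_value(valuations, allocation))  # 6
```

In this instance the allocation is envy-free but not equitable, which shows that the criteria can disagree on the same outcome. A Rawlsian maximin rule would compare allocations by the last quantity and prefer the one that maximizes it.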
Taxonomy
Topics: Ethics and Social Impacts of AI
