Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values
John P. Dickerson, Hadi Hosseini, Samarth Khanna, Leona Pierce

TL;DR
This paper evaluates how large language models behave in kidney allocation scenarios, revealing deviations from human values and limited indecision, and explores fine-tuning methods to improve alignment.
Contribution
It systematically compares LLM decisions with human preferences in kidney allocation scenarios and shows that low-rank supervised fine-tuning on few samples often improves alignment with those preferences.
Findings
LLMs deviate from human moral priorities in resource allocation.
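LLMs rarely express indecision, choosing deterministically even when an indecision mechanism such as a coin flip is offered.
Low-rank supervised fine-tuning with few samples often improves decision consistency and the calibration of indecision.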
Abstract
The rapid integration of Large Language Models (LLMs) in high-stakes decision-making -- such as allocating scarce resources like donor organs -- raises critical questions about their alignment with human moral values. We systematically evaluate the behavior of several prominent LLMs against human preferences in kidney allocation scenarios and show that LLMs i) exhibit stark deviations from human values in prioritizing various attributes, and ii) in contrast to humans, rarely express indecision, opting for deterministic decisions even when alternative indecision mechanisms (e.g., coin flipping) are provided. Nonetheless, we show that low-rank supervised fine-tuning with few samples is often effective both in improving decision consistency and in calibrating indecision modeling. These findings illustrate the necessity of explicit alignment strategies for LLMs in moral/ethical domains.
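To make the indecision comparison concrete, here is a minimal sketch of how one might measure how often a respondent defers to a coin flip when choosing between two patients. The option labels ("A", "B", "flip"), the function name `choice_rates`, and the sample answers are all hypothetical illustrations, not the paper's data or evaluation code.

```python
from collections import Counter

# Assumed option labels: "A" and "B" are the two patients, "flip" is the coin-flip option.
OPTIONS = ("A", "B", "flip")

def choice_rates(responses):
    """Return the fraction of responses that chose each option label."""
    if not responses:
        raise ValueError("responses must be non-empty")
    counts = Counter(responses)
    total = len(responses)
    return {label: counts[label] / total for label in OPTIONS}

# Hypothetical answers to the same patient-pair question (made up for illustration).
human_answers = ["A", "flip", "A", "B", "flip", "A", "flip", "A", "B", "A"]
model_answers = ["A"] * 10  # a model that always picks the same patient

human = choice_rates(human_answers)
model = choice_rates(model_answers)

# A positive gap means humans defer to the coin flip more often than the model does.
indecision_gap = human["flip"] - model["flip"]
print(f"human flip rate: {human['flip']:.2f}, model flip rate: {model['flip']:.2f}")
```

In this toy setup, a model that never selects the coin flip shows the pattern the abstract describes: deterministic choices where human respondents sometimes decline to choose.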
