Decoding fairness: a reinforcement learning perspective

Guozhong Zheng, Jiqiang Zhang, Xin Ou, Shengfeng Deng, and Li Chen

arXiv:2412.16249 · cs.LG · February 4, 2026

TL;DR

This paper demonstrates that fairness in the ultimatum game can emerge endogenously through reinforcement learning, specifically Q-learning, without relying on exogenous factors, in line with observations from behavioral experiments.

Contribution

The paper introduces a reinforcement learning framework for the ultimatum game in which fairness emerges from reward maximization alone, challenging explanations that rely on exogenous factors.

Findings

01. In the two-player scenario, fairness emerges when players value both past experience and future rewards.

02. The system passes through two phases and eventually stabilizes into fair or rational strategies.

03. Results are robust across different role assignment methods and population structures.
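For reference, the first finding can be read against the textbook Q-learning update. The symbols below are the standard ones and are not taken from this page: \alpha is the learning rate, \gamma the discount factor, r the reward, s and s' the current and next states, and a the chosen action. Reading a small \alpha as "valuing past experience" and a large \gamma as "valuing future rewards" is an interpretation, not a statement of the paper's exact parameterization.

```latex
Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a)
  + \alpha \left[ r + \gamma \max_{a'} Q(s', a') \right]
```

With \alpha close to 0 the old estimate Q(s, a) dominates the update, and with \gamma close to 1 the estimated value of the next state weighs almost as much as the immediate reward.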

Abstract

Behavioral experiments on the ultimatum game (UG) reveal that humans prefer fair acts, which contradicts the prediction made in orthodox economics. Existing explanations, however, mostly attribute this preference to exogenous factors within the imitation learning framework. Here, we adopt the reinforcement learning paradigm, where individuals make their moves aiming to maximize their accumulated rewards. Specifically, we apply Q-learning to UG, where each player is assigned two Q-tables to guide decisions for the roles of proposer and responder. In a two-player scenario, fairness emerges prominently when both experiences and future rewards are appreciated. In particular, the probability of successful deals increases with higher offers, which aligns with observations in behavioral experiments. Our mechanism analysis reveals that the system undergoes two phases, eventually stabilizing into fair…
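The two-player setup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the discretization into 11 offer and threshold levels, the state definition (the co-player's most recent action), the epsilon-greedy exploration, the random role assignment, the parameter values, and all names (`Player`, `ug_payoffs`, `simulate`) are hypothetical choices made for this sketch. The official repository may differ on every one of them.

```python
import random

N_LEVELS = 11  # assumption: offers and thresholds take values 0.0, 0.1, ..., 1.0


def ug_payoffs(offer_idx, threshold_idx):
    """Return (proposer_reward, responder_reward) for one round."""
    if offer_idx >= threshold_idx:  # deal: the offer meets the threshold
        top = N_LEVELS - 1
        return (top - offer_idx) / top, offer_idx / top
    return 0.0, 0.0  # rejection: both players get nothing


class Player:
    """A player with two Q-tables, one for each role."""

    def __init__(self, alpha, gamma, epsilon, rng):
        self.alpha, self.gamma, self.epsilon, self.rng = alpha, gamma, epsilon, rng
        # Each table is indexed as table[state][action].
        self.q = {role: [[0.0] * N_LEVELS for _ in range(N_LEVELS)]
                  for role in ("proposer", "responder")}

    def act(self, role, state):
        # Epsilon-greedy choice (assumption), with random tie-breaking.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(N_LEVELS)
        row = self.q[role][state]
        best = max(row)
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def learn(self, role, state, action, reward, next_state):
        # Standard Q-learning update on the table for the role just played.
        table = self.q[role]
        target = reward + self.gamma * max(table[next_state])
        table[state][action] += self.alpha * (target - table[state][action])


def simulate(rounds=20000, alpha=0.1, gamma=0.9, epsilon=0.01, seed=0):
    """Play repeated rounds between two players; return them and the deal rate."""
    rng = random.Random(seed)
    players = [Player(alpha, gamma, epsilon, rng) for _ in range(2)]
    last_offer, last_threshold = 0, 0  # arbitrary initial state (assumption)
    deals = 0
    for _ in range(rounds):
        i = rng.randrange(2)  # random role assignment each round (assumption)
        proposer, responder = players[i], players[1 - i]
        # Assumed state: the co-player's most recent action in the opposite role.
        offer = proposer.act("proposer", last_threshold)
        threshold = responder.act("responder", last_offer)
        r_p, r_r = ug_payoffs(offer, threshold)
        proposer.learn("proposer", last_threshold, offer, r_p, threshold)
        responder.learn("responder", last_offer, threshold, r_r, offer)
        last_offer, last_threshold = offer, threshold
        deals += offer >= threshold
    return players, deals / rounds


if __name__ == "__main__":
    _, deal_rate = simulate()
    print(f"fraction of successful deals: {deal_rate:.3f}")
```

Varying `alpha` and `gamma` in `simulate` is one way to explore how the weight on past experience and on future rewards affects the offers that survive. The sketch makes no claim to reproduce the paper's results.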

Code & Models

Repositories

chenli-lab/RL-Fairness (Official)

Taxonomy

Topics: Experimental Behavioral Economics Studies · Economic and Technological Innovation

Methods: ADaptive gradient method with the OPTimal convergence rate · Q-Learning