Fairness Aware Reinforcement Learning via Proximal Policy Optimization

Gabriele La Malfa; Jie M. Zhang; Michael Luck; Elizabeth Black

arXiv:2502.03953 · cs.MA · September 3, 2025

TL;DR

This paper proposes Fair-PPO, a reinforcement learning algorithm that adds a penalty derived from a fairness definition such as demographic parity to PPO, balancing reward maximisation with equitable outcomes in multi-agent systems.

Contribution

It introduces a novel fairness-aware PPO method with retrospective and prospective penalty components, improving fairness in multi-agent reinforcement learning scenarios.

Findings

01

Fair-PPO achieves fairer policies than standard PPO.

02

The method balances fairness with performance, maintaining efficiency.

03

It reveals diverse strategies to enhance fairness in multi-agent systems.

Abstract

Fairness in multi-agent systems (MAS) focuses on equitable reward distribution among agents in scenarios involving sensitive attributes such as race, gender, or socioeconomic status. This paper introduces fairness in Proximal Policy Optimization (PPO) with a penalty term derived from a fairness definition such as demographic parity, counterfactual fairness, or conditional statistical parity. The proposed method, which we call Fair-PPO, balances reward maximisation with fairness by integrating two penalty components: a retrospective component that minimises disparities in past outcomes and a prospective component that ensures fairness in future decision-making. We evaluate our approach in two games: the Allelopathic Harvest, a cooperative and competitive MAS focused on resource collection, where some agents possess a sensitive attribute, and HospitalSim, a hospital simulation, in which…
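The penalised objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes demographic parity is measured as the absolute gap in mean outcome between agents with and without the sensitive attribute, that the retrospective component uses returns already collected, and that the prospective component uses predicted future returns. The names `parity_gap`, `fair_objective`, `alpha`, and `beta` are hypothetical and do not come from the paper.

```python
from statistics import fmean


def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, averaged over samples (to be maximised)."""
    terms = []
    for r, a in zip(ratios, advantages):
        r_clipped = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(r * a, r_clipped * a))
    return fmean(terms)


def parity_gap(outcomes, sensitive):
    """Assumed demographic-parity measure: absolute gap in mean outcome
    between agents with and without the sensitive attribute."""
    with_attr = [o for o, s in zip(outcomes, sensitive) if s]
    without_attr = [o for o, s in zip(outcomes, sensitive) if not s]
    return abs(fmean(with_attr) - fmean(without_attr))


def fair_objective(ratios, advantages, past_returns, predicted_returns,
                   sensitive, alpha=1.0, beta=1.0):
    """PPO surrogate minus two fairness penalties.

    alpha weights the retrospective penalty (gap in past returns);
    beta weights the prospective penalty (gap in predicted returns).
    Both weights and both penalty forms are illustrative assumptions.
    """
    retrospective = parity_gap(past_returns, sensitive)
    prospective = parity_gap(predicted_returns, sensitive)
    surrogate = clipped_surrogate(ratios, advantages)
    return surrogate - alpha * retrospective - beta * prospective


# Toy example: four agents, the first two carry the sensitive attribute.
objective = fair_objective(
    ratios=[1.0, 1.0],
    advantages=[1.0, 3.0],
    past_returns=[4.0, 2.0, 2.0, 0.0],
    predicted_returns=[1.0, 1.0, 1.0, 1.0],
    sensitive=[True, True, False, False],
    alpha=0.5,
    beta=0.5,
)
```

Setting `alpha` and `beta` to zero recovers the plain clipped surrogate, so in this sketch the two weights control how much reward the policy trades away to close the gap between groups.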

Taxonomy

Topics: Behavioral Health and Interventions · Ethics and Social Impacts of AI · Human-Automation Interaction and Safety

Methods: Entropy Regularization · Mixing Adam and SGD · Proximal Policy Optimization