Fair Off-Policy Learning from Observational Data

Dennis Frauen; Valentyn Melnychuk; Stefan Feuerriegel

arXiv:2303.08516 · cs.LG · October 10, 2023 · 1 citation

Open Access · 3 Reviews

TL;DR

This paper introduces a neural network-based framework for fair off-policy learning from observational data, formalizing fairness notions, providing theoretical guarantees, and demonstrating effectiveness through experiments.

Contribution

It presents a novel framework for learning fair decision policies from observational data, including formal fairness definitions and theoretical generalization bounds.

Findings

01. Effective in achieving fairness in decision policies

02. Theoretical guarantees support the framework's reliability

03. Successful experiments on simulated and real-world data

Abstract

Algorithmic decision-making in practice must be fair for legal, ethical, and societal reasons. To achieve this, prior research has contributed various approaches that ensure fairness in machine learning predictions, while comparatively little effort has focused on fairness in decision-making, specifically off-policy learning. In this paper, we propose a novel framework for fair off-policy learning: we learn decision rules from observational data under different notions of fairness, where we explicitly assume that the observational data were collected under a different, potentially discriminatory behavioral policy. For this, we first formalize different fairness notions for off-policy learning. We then propose a neural network-based framework to learn optimal policies under different fairness notions. We further provide theoretical guarantees in the form of generalization bounds for the…

Peer Reviews

Decision: ICML 2024 Poster

Reviewer 01 · Rating 5 (marginally below the acceptance threshold) · Confidence 4

Strengths

1. Fairness in offline policy learning from observational data seems to be an interesting and important problem.
2. The paper is, for the most part, well written and easy to follow.
3. The authors conducted both a theoretical analysis and an empirical evaluation of the proposed algorithm.

Weaknesses

1. One concern is that the paper does not discuss how it deals with observational data. In particular, the proposed method depends on several estimators such as DM, IPW, and DR, but some quantities are unknown in the observational setting, like \mu_1, \mu_0, and \pi_b. It would be great if the authors could discuss how these quantities are obtained and how they affect the theoretical and empirical results.
2. For the empirical evaluation, especially the one on real-world data, we do not know the counterfactual o

Reviewer 02 · Rating 8 (accept, good paper) · Confidence 3

Strengths

The paper is very well written and easy to follow. The authors take considerable care when defining concepts in the paper to make things clear for the reader. The paper is novel in that it produces a framework that is considerably less restrictive than other related works. Specifically, the addition of a neural approach to fair off-policy learning. It is also general enough to fit many different contexts and needs of practitioners. The authors provide clear generalization bounds. The authors

Weaknesses

Mostly minor nits for weaknesses: It is unfortunate that no other baselines are available for this work. Although space is very limited, it would be good to include more discussion from Appendix I in the paper. The plots in the paper need to be more readable, with thicker lines and larger text that matches the text size in the paper.

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 3

Strengths

1. The paper simultaneously considers action fairness and policy value fairness in off-policy learning. The results on the relations between the two notions of fairness are novel to my knowledge.
2. The paper is well organized and easy to follow. In addition to the algorithm, it provides a generalization bound and validation on both synthetic and real data.

Weaknesses

1. While the proposed method can learn optimal policies under both action and policy value fairness, the underlying technique seems to be very similar to existing methods. The technical contribution of this paper is unclear to me. Specifically, the proposed solution includes two steps: the first step aims to learn a fair representation to ensure action fairness, while the second step incorporates the policy value fairness constraints into the objective function. For the first step, the idea of l

Taxonomy

Topics: Ethics and Social Impacts of AI