Fairness in Learning: Classic and Contextual Bandits

Matthew Joseph; Michael Kearns; Jamie Morgenstern; Aaron Roth

arXiv:1605.07139 · cs.LG · November 8, 2016 · 192 cites

TL;DR

This paper introduces a fairness definition for multi-armed bandits, gives a provably fair algorithm for the classic stochastic setting with a regret bound cubic in the number of arms, shows that any fair algorithm must have such a dependence, and connects fairness to KWIK learning in the contextual setting.

Contribution

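The paper introduces a formal fairness definition for bandits under which a worse option is never favored over a better one, develops fair algorithms with regret bounds, and links fairness to KWIK learning for contextual bandits. Compared with regret bounds for standard non-fair algorithms such as UCB, the results establish a strong separation between fair and unfair learning.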

Findings

01

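The proposed fair algorithm for classic stochastic bandits, based on "chained" confidence intervals, has a cumulative regret bound with cubic dependence on the number of arms.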

02

Any fair algorithm must have such a cubic dependence on the number of arms, so the cost is inherent to fairness rather than to one particular algorithm.

03

A tight connection between fairness and KWIK learning enables new fair algorithms for linear contextual bandits.

Abstract

We introduce the study of fairness in multi-armed bandit problems. Our fairness definition can be interpreted as demanding that given a pool of applicants (say, for college admission or mortgages), a worse applicant is never favored over a better one, despite a learning algorithm's uncertainty over the true payoffs. We prove results of two types. First, in the important special case of the classic stochastic bandits problem (i.e., in which there are no contexts), we provide a provably fair algorithm based on "chained" confidence intervals, and provide a cumulative regret bound with a cubic dependence on the number of arms. We further show that any fair algorithm must have such a dependence. When combined with regret bounds for standard non-fair algorithms such as UCB, this proves a strong separation between fair and unfair learning, which extends to the general contextual case. In…
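The "chained" confidence-interval idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm as specified in the paper: it assumes Bernoulli rewards, a plain Hoeffding-style interval width with no union bound over arms or rounds, and uniform play over the chained set, so it does not carry the paper's fairness or regret guarantees. The names `overlaps`, `chained_active_set`, `hoeffding_interval`, and `run_fair_bandit` are hypothetical and chosen only for this illustration.

```python
import math
import random


def overlaps(a, b):
    """Return True if closed intervals a=(lo, hi) and b=(lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def chained_active_set(intervals):
    """Indices of arms linked, through a chain of overlapping intervals,
    to the arm with the highest upper confidence bound."""
    top = max(range(len(intervals)), key=lambda i: intervals[i][1])
    active = {top}
    frontier = [top]
    while frontier:
        i = frontier.pop()
        for j in range(len(intervals)):
            if j not in active and overlaps(intervals[i], intervals[j]):
                active.add(j)
                frontier.append(j)
    return active


def hoeffding_interval(total_reward, pulls, delta):
    """Illustrative interval for rewards in [0, 1].
    ASSUMPTION: simple Hoeffding width, not the paper's exact constants."""
    if pulls == 0:
        return (0.0, 1.0)
    mean = total_reward / pulls
    width = math.sqrt(math.log(2.0 / delta) / (2.0 * pulls))
    return (mean - width, mean + width)


def run_fair_bandit(true_means, rounds, delta=0.05, seed=0):
    """Simulate Bernoulli arms; each round, play uniformly at random
    among the arms chained to the current top arm."""
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    totals = [0.0] * k
    for _ in range(rounds):
        intervals = [hoeffding_interval(totals[i], pulls[i], delta)
                     for i in range(k)]
        active = sorted(chained_active_set(intervals))
        arm = rng.choice(active)  # uniform play: no arm in the chain is favored
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


if __name__ == "__main__":
    # Arms 0, 1, 2 are chained through pairwise overlaps; arm 3 is separated.
    print(chained_active_set([(0.8, 1.0), (0.6, 0.85), (0.4, 0.65), (0.0, 0.2)]))
    print(run_fair_bandit([0.9, 0.1], rounds=2000))
```

Playing uniformly over the whole chain, rather than only the arm with the highest upper bound as UCB does, is what prevents the learner from favoring one arm over another while their intervals still cannot be told apart. The abstract's cubic dependence on the number of arms reflects the price of this behavior.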

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms