# Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

**Authors:** Maria De-Arteaga, Alexey Romanov, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Adam Tauman Kalai

arXiv: 1901.09451 · 2019-01-29

## TL;DR

This study measures gender bias in classifiers that predict occupation from online biographies. It shows that explicit gender indicators affect the bias, that bias remains after those indicators are removed, and that the resulting errors may reinforce existing occupational gender imbalances.

## Contribution

The paper provides a large-scale analysis of semantic representation bias in occupation classification, a high-stakes task, covering both the effect of explicit gender indicators and the proxy behavior that appears when they are absent.

## Key findings

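- Including explicit gender indicators, such as first names and pronouns, affects true positive rates differently by gender.
- Scrubbing these indicators does not eliminate the bias, because other features in the text act as proxies for gender.
- Differences in true positive rates between genders correlate with existing gender imbalances in occupations, which may compound those imbalances.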
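
As a rough illustration of the quantity behind these findings, the sketch below computes a per-occupation true positive rate (TPR) gap between genders. It is a minimal example under stated assumptions, not the authors' code: the function name `tpr_gender_gap`, the record layout, the binary `"F"`/`"M"` labels, and the toy data are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def tpr_gender_gap(records):
    """Return {occupation: TPR_F - TPR_M}.

    `records` is an iterable of (true_occupation, predicted_occupation, gender)
    tuples with gender in {"F", "M"} (an assumed, simplified encoding).
    """
    hits = defaultdict(lambda: {"F": 0, "M": 0})    # correct predictions
    totals = defaultdict(lambda: {"F": 0, "M": 0})  # biographies per group
    for true_occ, pred_occ, gender in records:
        totals[true_occ][gender] += 1
        if pred_occ == true_occ:
            hits[true_occ][gender] += 1

    gaps = {}
    for occ, counts in totals.items():
        # The gap is undefined when one gender has no examples; skip it.
        if counts["F"] == 0 or counts["M"] == 0:
            continue
        tpr_f = hits[occ]["F"] / counts["F"]
        tpr_m = hits[occ]["M"] / counts["M"]
        gaps[occ] = tpr_f - tpr_m
    return gaps

# Hypothetical toy data, not taken from the paper.
toy_records = [
    ("surgeon", "surgeon", "M"), ("surgeon", "surgeon", "M"),
    ("surgeon", "nurse", "F"), ("surgeon", "surgeon", "F"),
    ("nurse", "nurse", "F"), ("nurse", "nurse", "F"),
    ("nurse", "surgeon", "M"), ("nurse", "nurse", "M"),
]
print(tpr_gender_gap(toy_records))
```

A negative gap for an occupation means that women who hold it are correctly classified less often than men who hold it. Comparing these gaps with the share of each gender in the occupation is one way to check for the correlation described in the list above.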

## Abstract

We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators, such as first names and pronouns, in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
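
To make the idea of "scrubbing" concrete, here is a minimal sketch that masks a biography's first name and a small set of gendered words. It is an assumption-laden illustration rather than the authors' preprocessing: the word list `GENDERED_WORDS`, the `scrub` function, and the `_` placeholder are hypothetical, and a real pipeline would need a more complete list and proper tokenization.

```python
import re

# Illustrative, deliberately incomplete list of explicit gender indicators.
GENDERED_WORDS = {
    "he", "she", "him", "her", "his", "hers",
    "himself", "herself", "mr", "mrs", "ms", "miss",
}

def scrub(bio, first_name=None, placeholder="_"):
    """Replace gendered words (and optionally a first name) with a placeholder."""
    blocked = set(GENDERED_WORDS)
    if first_name:
        blocked.add(first_name.lower())

    def mask(match):
        word = match.group(0)
        return placeholder if word.lower() in blocked else word

    # Match alphabetic runs only, so punctuation and spacing are preserved.
    return re.sub(r"[A-Za-z]+", mask, bio)

print(scrub("Dr. Jane Doe is a surgeon. She runs her own clinic.", first_name="Jane"))
# prints "Dr. _ Doe is a surgeon. _ runs _ own clinic."
```

Even after this kind of masking, other words in the biography can still correlate with gender, which is the proxy behavior the abstract describes.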

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.09451/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/1901.09451/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/1901.09451/full.md

---
Source: https://tomesphere.com/paper/1901.09451