The Dataset Nutrition Label (2nd Gen): Leveraging Context to Mitigate Harms in Artificial Intelligence

Kasia S. Chmielinski; Sarah Newman; Matt Taylor; Josh Joseph; Kemi Thomas; Jessica Yurkofsky; Yue Chelsea Qiu

arXiv:2201.03954·cs.LG·March 11, 2022·28 cites

Open Access

TL;DR

This paper introduces the second generation of the Dataset Nutrition Label, emphasizing context-aware features to help data scientists identify and mitigate biases and harms in datasets used for AI systems.

Contribution

It presents an updated, more comprehensive Dataset Nutrition Label with new design, context-specific use cases, and alerts to improve dataset transparency and bias mitigation.

Findings

01. New Label design and interface for data scientists

02. Inclusion of context-specific use cases and alerts

03. Application to additional datasets and ongoing challenges

Abstract

As the production of and reliance on datasets to produce automated decision-making systems (ADS) increases, so does the need for processes for evaluating and interrogating the underlying data. After launching the Dataset Nutrition Label in 2018, the Data Nutrition Project has made significant updates to the design and purpose of the Label, and is launching an updated Label in late 2020, which is previewed in this paper. The new Label includes context-specific Use Cases & Alerts presented through an updated design and user interface targeted towards the data scientist profile. This paper discusses the harm and bias from underlying training data that the Label is intended to mitigate, the current state of the work including new datasets being labeled, new and existing challenges, and further directions of the work, as well as Figures previewing the new label.

Taxonomy

Topics: Nutrition, Genetics, and Disease