A Tale Of Two Long Tails

Daniel D'souza; Zach Nussbaum; Chirag Agarwal; Sara Hooker

arXiv:2107.13098 · cs.CV · July 29, 2021 · 1 citation

TL;DR

This paper investigates the sources of uncertainty in machine learning models, proposing targeted data augmentation during training to better understand and differentiate between types of uncertain examples.

Contribution

It introduces a method for targeted data augmentation to analyze and distinguish different sources of uncertainty during model training.

Findings

01 Targeted augmentation improves understanding of uncertainty sources.

02 Atypical and noisy examples are learned at different rates.

03 Interventions can effectively characterize uncertainty types.

Abstract

As machine learning models are increasingly employed to assist human decision-makers, it becomes critical to communicate the uncertainty associated with these model predictions. However, the majority of work on uncertainty has focused on traditional probabilistic or ranking approaches - where the model assigns low probabilities or scores to uncertain examples. While this captures what examples are challenging for the model, it does not capture the underlying source of the uncertainty. In this work, we seek to identify examples the model is uncertain about and characterize the source of said uncertainty. We explore the benefits of designing a targeted intervention - targeted data augmentation of the examples where the model is uncertain over the course of training. We investigate whether the rate of learning in the presence of additional information differs between atypical and noisy…
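The targeted intervention described in the abstract can be illustrated with a minimal sketch. It assumes uncertainty is measured as the probability the model assigns to the true label, and that the augmentation is a horizontal flip; neither choice is stated in the abstract. The names `select_uncertain`, `horizontal_flip`, `targeted_augment`, and the `fraction` parameter are hypothetical and are not taken from the authors' repository.

```python
def select_uncertain(true_label_probs, fraction=0.1):
    """Return the indices of the lowest-confidence examples.

    Assumption: "uncertain" means a low predicted probability for the
    true label. The paper may use a different uncertainty measure.
    """
    k = max(1, int(len(true_label_probs) * fraction))
    order = sorted(range(len(true_label_probs)),
                   key=lambda i: true_label_probs[i])
    return set(order[:k])


def horizontal_flip(image):
    """Placeholder augmentation: mirror each row of a 2D list image."""
    return [row[::-1] for row in image]


def targeted_augment(images, true_label_probs, fraction=0.1,
                     augment=horizontal_flip):
    """Augment only the examples flagged as uncertain; leave the rest as-is."""
    uncertain = select_uncertain(true_label_probs, fraction)
    return [augment(img) if i in uncertain else img
            for i, img in enumerate(images)]


if __name__ == "__main__":
    images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]],
              [[9, 10], [11, 12]], [[0, 1], [2, 3]]]
    probs = [0.9, 0.2, 0.8, 0.7]  # per-example confidence in the true label
    print(targeted_augment(images, probs, fraction=0.25))
```

In a training loop, a step like this would be repeated each epoch with refreshed confidences, so that how quickly the flagged examples are learned can be tracked over time.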

Code & Models

Repositories

dsouzadaniel/long_tail (PyTorch, official)

Videos

#92 - SARA HOOKER - Fairness, Interpretability, Language Models · YouTube

Taxonomy

Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning