Adversarial Filters of Dataset Biases

Ronan Le Bras; Swabha Swayamdipta; Chandra Bhagavatula; Rowan Zellers; Matthew E. Peters; Ashish Sabharwal; Yejin Choi

arXiv:2002.04108 · cs.LG · July 14, 2020 · 125 citations

TL;DR

This paper examines AFLite, an adversarial filtering method that reduces dataset biases in NLP and vision tasks, leading to improved out-of-distribution generalization but also revealing new robustness challenges.

Contribution

It provides a theoretical framework for AFLite, demonstrates its effectiveness in bias reduction, and discusses implications for robust model evaluation.

Findings

1. AFLite effectively reduces measurable dataset biases.

2. Models trained on filtered datasets generalize better to out-of-distribution tasks.

3. Model performance drops significantly on the filtered datasets, highlighting robustness challenges.

Abstract

Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite, which adversarially filters such dataset biases, as a means to mitigate the prevalent overestimation of machine performance. We provide a theoretical understanding for AFLite, by situating it in the generalized framework for optimum bias reduction. We present extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks. Finally, filtering…
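The abstract does not describe AFLite's procedure, so the following is only an illustrative sketch of the general idea of adversarial filtering: repeatedly train simple classifiers on random splits, score how often each instance is predicted correctly when held out, and remove the most predictable instances. The nearest-centroid classifier, the function names, and every parameter (`n_rounds`, `train_frac`, `tau`, `k`, `target_size`) are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import random


def fit_centroids(X, y):
    """Return {label: mean feature vector} for the training instances."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids, x):
    """Predict the label whose centroid is closest in squared distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def adversarial_filter(X, y, n_rounds=20, train_frac=0.8, tau=0.75,
                       k=5, target_size=10, seed=0):
    """Return indices of instances kept after filtering (hypothetical API).

    Each pass estimates a predictability score per instance (fraction of
    held-out evaluations predicted correctly) and drops up to k instances
    whose score is at least tau, most predictable first.
    """
    rng = random.Random(seed)
    keep = list(range(len(X)))
    while len(keep) > target_size:
        correct = {i: 0 for i in keep}
        seen = {i: 0 for i in keep}
        for _ in range(n_rounds):
            idx = keep[:]
            rng.shuffle(idx)
            cut = int(train_frac * len(idx))
            train, held = idx[:cut], idx[cut:]
            centroids = fit_centroids([X[i] for i in train],
                                      [y[i] for i in train])
            for i in held:
                seen[i] += 1
                correct[i] += int(predict(centroids, X[i]) == y[i])
        # Score only instances that were held out at least once.
        scores = {i: correct[i] / seen[i] for i in keep if seen[i] > 0}
        candidates = sorted((i for i in scores if scores[i] >= tau),
                            key=lambda i: -scores[i])
        budget = min(k, len(keep) - target_size)
        removed = set(candidates[:budget])
        if not removed:
            break  # nothing left that is predictable enough
        keep = [i for i in keep if i not in removed]
        if len(removed) < k:
            break  # fewer than k removable instances: stop filtering
    return keep
```

Calling `adversarial_filter(X, y)` on a list of feature vectors `X` and labels `y` returns the indices of the retained instances. In practice the features would typically be precomputed representations rather than raw inputs, which is again an assumption of this sketch.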

Code & Models

Repositories

swabhs/notebooks_for_aflite

Videos

Adversarial Filters of Dataset Biases · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning