Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering

Siddharth Karamcheti; Ranjay Krishna; Li Fei-Fei; Christopher D. Manning

arXiv:2107.02331·cs.CL·July 7, 2021

TL;DR

This paper investigates why active learning often underperforms in visual question answering, identifying collective outliers as a key issue and proposing strategies to mitigate their negative impact.

Contribution

It uncovers the problem of collective outliers in active learning for VQA and provides insights and recommendations to improve sample efficiency.

Findings

1. Active learning often fails to outperform random selection in VQA.
2. Collective outliers are groups of examples that hinder learning.
3. Reducing collective outliers improves active learning efficiency.

Abstract

Active learning promises to alleviate the massive data needs of supervised machine learning: it has successfully improved sample efficiency by an order of magnitude on traditional tasks like topic classification and object recognition. However, we uncover a striking contrast to this promise: across 5 models and 4 datasets on the task of visual question answering, a wide variety of active learning approaches fail to outperform random selection. To understand this discrepancy, we profile 8 active learning methods on a per-example basis, and identify the problem as collective outliers -- groups of examples that active learning methods prefer to acquire but models fail to learn (e.g., questions that ask about text in images or require external knowledge). Through systematic ablation experiments and qualitative visualizations, we verify that collective outliers are a general phenomenon…
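To make the comparison in the abstract concrete, here is a minimal sketch of one common acquisition strategy (maximum predictive entropy) next to the random-selection baseline. It is illustrative only: this excerpt does not name the 8 methods the paper profiles, so entropy sampling is an assumed stand-in, and the function names and the toy `pool_probs` values are hypothetical rather than taken from the paper or its repository.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of one predicted answer distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_random(pool_probs, k, seed=0):
    """Baseline: pick k pool indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(pool_probs)), k))


def select_max_entropy(pool_probs, k):
    """Uncertainty sampling: pick the k examples the model is least sure about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


# Toy unlabeled pool (hypothetical values): each row is a model's softmax
# output over three candidate answers for one image-question pair.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # near-uniform: maximally uncertain
    [0.70, 0.20, 0.10],
    [0.40, 0.35, 0.25],
]

chosen = select_max_entropy(pool_probs, k=2)
baseline = select_random(pool_probs, k=2)
print("entropy picks:", chosen)
print("random picks:", baseline)
```

An uncertainty-driven rule like this keeps selecting whichever examples the model remains unsure about. If those examples are ones the model cannot learn, as the abstract describes for collective outliers, the labeling budget is spent on them without improving accuracy, which is one way such a strategy could end up no better than the random baseline.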

Code & Models

Repositories

siddk/vqa-outliers (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning