Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

Hyeonwoo Noh; Taehoon Kim; Jonghwan Mun; Bohyung Han

arXiv:1810.02358 · cs.LG · April 9, 2019

TL;DR

This paper improves visual question answering by learning a task-conditional visual classifier through unsupervised task discovery and then transferring it to a VQA model, so that the model can handle out-of-vocabulary answers using off-the-shelf visual and linguistic data.

Contribution

It proposes an approach for transferring a task-conditional visual classifier, learned through unsupervised task discovery, to VQA models, bridging the gap between visual data without questions and question-dependent answering models.

Findings

01. Successfully generalizes to out-of-vocabulary answers

02. Leverages structured lexical databases and visual descriptions

03. Enhances VQA performance with transferred visual classifiers

Abstract

We study how to leverage off-the-shelf visual and linguistic data to cope with out-of-vocabulary answers in the visual question answering task. Existing large-scale visual datasets with annotations such as image class labels, bounding boxes and region descriptions are good sources for learning rich and diverse visual concepts. However, it is not straightforward how the visual concepts can be captured and transferred to visual question answering models, due to the missing link between question-dependent answering models and visual data without questions. We tackle this problem in two steps: 1) learning a task conditional visual classifier, which is capable of solving diverse question-specific visual recognition tasks, based on unsupervised task discovery and 2) transferring the task conditional visual classifier to visual question answering models. Specifically, we employ linguistic knowledge…
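The two steps named in the abstract can be illustrated with a toy sketch. It is not the authors' architecture: the feature sizes, the element-wise fusion, the random weights, and the `question_to_task` helper are all assumptions made for illustration. It only shows the general shape of the idea, a classifier that scores answers from a visual feature and a task feature, reused by a VQA-side encoder that turns a question into a task feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
VISUAL_DIM, TASK_DIM, QUESTION_DIM, HIDDEN_DIM, NUM_ANSWERS = 8, 4, 6, 16, 5


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


class TaskConditionalClassifier:
    """Scores candidate answers given a visual feature and a task feature."""

    def __init__(self):
        # Random weights stand in for parameters learned in step 1.
        self.w_visual = rng.normal(size=(HIDDEN_DIM, VISUAL_DIM))
        self.w_task = rng.normal(size=(HIDDEN_DIM, TASK_DIM))
        self.w_out = rng.normal(size=(NUM_ANSWERS, HIDDEN_DIM))

    def predict(self, visual, task):
        # Assumed fusion: element-wise product of the two projections.
        hidden = np.tanh(self.w_visual @ visual) * np.tanh(self.w_task @ task)
        return softmax(self.w_out @ hidden)


def question_to_task(question_feature, w_question):
    """Hypothetical VQA-side encoder mapping a question to a task feature."""
    return np.tanh(w_question @ question_feature)


# Step 1 (assumed): the classifier is learned from visual data without questions.
classifier = TaskConditionalClassifier()

# Step 2 (assumed): the VQA model supplies the task feature and reuses the classifier.
visual = rng.normal(size=VISUAL_DIM)
question = rng.normal(size=QUESTION_DIM)
w_question = rng.normal(size=(TASK_DIM, QUESTION_DIM))

probs = classifier.predict(visual, question_to_task(question, w_question))
print(probs.shape, float(probs.sum()))
```

In this sketch the answer distribution depends on the question only through the task feature, which is one way to read "task conditional"; how the actual model conditions on the task is not specified in the truncated abstract.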

Code & Models

Repositories

HyeonwooNoh/vqa_task_discovery (TensorFlow, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning