A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Mateusz Malinowski; Mario Fritz

arXiv:1410.0210·cs.AI·May 6, 2015·258 cites

TL;DR

This paper introduces a multi-world Bayesian approach to answer complex questions about real-world images, integrating NLP and computer vision to handle uncertainty and provide diverse answer types.

Contribution

It presents a novel multi-world Bayesian framework for visual question answering that manages uncertainty and handles complex, realistic scene questions.

Findings

01 Established a first benchmark for visual question answering.

02 Demonstrated that the system handles highly complex human questions about realistic scenes despite uncertain visual input.

03 Produced diverse answer types, including counts, object classes, instances, and lists of them.

Abstract

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision. We combine discrete reasoning with uncertain predictions by a multi-world approach that represents uncertainty about the perceived world in a Bayesian framework. Our approach can handle human questions of high complexity about realistic scenes and replies with a range of answers such as counts, object classes, instances, and lists of them. The system is directly trained from question-answer pairs. We establish a first benchmark for this task that can be seen as a modern attempt at a visual Turing test.
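To make the multi-world idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical perception module that outputs an independent class distribution for each image segment, treats every joint labeling as one possible "world", and answers a question by evaluating it deterministically in each world and summing the world probabilities per answer. The names `answer_distribution`, `count_chairs`, and the toy `segments` data are illustrative assumptions. Exhaustive enumeration is a simplification chosen here for clarity; the abstract does not specify how worlds are obtained.

```python
from collections import defaultdict
from itertools import product


def answer_distribution(segments, question):
    """Marginalize a deterministic question over all possible worlds.

    segments: list of dicts mapping class label -> probability
              (hypothetical per-segment classifier output, each sums to 1).
    question: function taking a world (tuple of labels) and returning an answer.
    """
    posterior = defaultdict(float)
    label_sets = [list(seg.items()) for seg in segments]
    # Each element of the Cartesian product is one fully labeled world.
    for assignment in product(*label_sets):
        world = tuple(label for label, _ in assignment)
        weight = 1.0
        for _, prob in assignment:
            weight *= prob  # segments are assumed independent
        posterior[question(world)] += weight
    return dict(posterior)


def count_chairs(world):
    """Toy stand-in for a parsed question: 'How many chairs are there?'"""
    return world.count("chair")


# Toy scene with two uncertain segments (made-up probabilities).
segments = [
    {"chair": 0.5, "table": 0.5},
    {"chair": 0.5, "sofa": 0.5},
]
dist = answer_distribution(segments, count_chairs)
best_answer = max(dist, key=dist.get)
```

In this toy scene the most probable answer is one chair, even though no single segment is labeled with certainty. That is the point of reasoning over several worlds instead of committing to one scene interpretation.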

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques