Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios

Shantanu Jaiswal; Debaditya Roy; Basura Fernando; Cheston Tan

arXiv:2411.13754 · cs.LG · November 22, 2024

TL;DR

This paper introduces IPRM (Iterative and Parallel Reasoning Mechanism), a fully neural reasoning mechanism that combines iterative and parallel computation for complex visual question answering (VQA). It is reported to outperform prior methods on several VQA benchmarks and to allow its reasoning steps to be visualized.

Contribution

The paper proposes a novel neural reasoning module, IPRM, that integrates iterative and parallel processes for improved complex visual reasoning in VQA tasks.

Findings

01 Outperforms prior methods on various VQA benchmarks

02 Enables visualization of reasoning steps for interpretability

03 Effective in diverse complex reasoning scenarios

Abstract

Complex visual reasoning and question answering (VQA) is a challenging task that requires compositional multi-step processing and higher-level reasoning capabilities beyond the immediate recognition and localization of objects and events. Here, we introduce a fully neural Iterative and Parallel Reasoning Mechanism (IPRM) that combines two distinct forms of computation -- iterative and parallel -- to better address complex VQA scenarios. Specifically, IPRM's "iterative" computation facilitates compositional step-by-step reasoning for scenarios wherein individual operations need to be computed, stored, and recalled dynamically (e.g. when computing the query "determine the color of pen to the left of the child in red t-shirt sitting at the white table"). Meanwhile, its "parallel" computation allows for the simultaneous exploration of different reasoning paths and benefits more robust and…
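
The distinction the abstract draws between "iterative" and "parallel" computation can be illustrated with a toy loop. The sketch below is not IPRM's architecture, which this page does not specify. It only assumes that several independent reasoning lanes each attend over question tokens and then over visual features, and that every lane's stored state is updated step by step. All names (`attend`, `toy_reason`, the 0.5 mixing weight) are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Softmax-weighted average of `items`, scored by dot product with `query`."""
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    return [sum(w * item[d] for w, item in zip(weights, items)) for d in range(dim)]


def toy_reason(question_tokens, visual_feats, init_states, num_steps):
    """Toy loop (assumption, not the paper's method).

    Each entry of `init_states` is one lane; lanes are independent within a
    step (the "parallel" part), while steps run in sequence and reuse the
    stored state from the previous step (the "iterative" part).
    """
    states = [list(s) for s in init_states]
    for _ in range(num_steps):
        new_states = []
        for state in states:
            op = attend(state, question_tokens)   # which part of the question to use
            result = attend(op, visual_feats)     # what the visual input returns for it
            # Hypothetical update rule: average the old state with the new result.
            new_states.append([0.5 * s + 0.5 * r for s, r in zip(state, result)])
        states = new_states
    return states


tokens = [[1.0, 0.0], [0.0, 1.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
final_states = toy_reason(tokens, feats, [[1.0, 0.0], [0.0, 1.0]], num_steps=2)
```

Here `len(init_states)` sets how many lanes run side by side and `num_steps` sets how many sequential updates each lane receives. A real model would learn the projections and the state update rather than use fixed dot products and a fixed average.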

Videos

Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios · SlidesLive

Taxonomy

Topics: AI-based Problem Solving and Planning

Methods: Softmax · Attention Is All You Need