Graphhopper: Multi-Hop Scene Graph Reasoning for Visual Question Answering

Rajat Koner; Hang Li; Marcel Hildebrandt; Deepan Das; Volker Tresp; Stephan Günnemann

arXiv:2107.06325 · cs.CV · July 15, 2021

TL;DR

Graphhopper introduces a multi-hop scene graph reasoning approach for visual question answering, combining knowledge graph navigation with reinforcement learning to improve reasoning accuracy on complex image questions.

Contribution

The paper presents a multi-hop reasoning method for VQA in which a reinforcement learning agent navigates a scene graph, achieving performance competitive with state-of-the-art models.

Findings

1. Performs on par with human accuracy on manually curated scene graphs.

2. Outperforms existing scene graph reasoning models on the GQA dataset.

3. Is effective on both manually curated and automatically generated scene graphs.

Abstract

Visual Question Answering (VQA) is concerned with answering free-form questions about an image. Since it requires a deep semantic and linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires multi-modal reasoning from both computer vision and natural language processing. We propose Graphhopper, a novel method that approaches the task by integrating knowledge graph reasoning, computer vision, and natural language processing techniques. Concretely, our method is based on performing context-driven, sequential reasoning based on the scene entities and their semantic and spatial relationships. As a first step, we derive a scene graph that describes the objects in the image, as well as their attributes and their mutual relationships. Subsequently, a reinforcement learning agent is trained…
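
The abstract is cut off before it describes how the agent is trained, so the following is only a minimal sketch of the general idea of multi-hop navigation over a scene graph, not the authors' implementation. The toy triples, the function names (`build_graph`, `score`, `walk`), and the word-overlap scoring rule are all assumptions made for illustration; in the paper the choice of the next hop comes from a policy learned with reinforcement learning, which this hand-written heuristic merely stands in for.

```python
from collections import defaultdict

# Toy scene graph as (subject, relation, object) triples.
# All entities and relations here are invented for illustration.
TRIPLES = [
    ("man", "wearing", "hat"),
    ("man", "holding", "umbrella"),
    ("umbrella", "has_color", "red"),
    ("hat", "has_color", "black"),
    ("dog", "next_to", "man"),
]


def build_graph(triples):
    """Return an adjacency list: node -> list of (relation, neighbour)."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def score(action, question_tokens):
    """Hypothetical stand-in for a learned policy: count question-word overlap."""
    rel, node = action
    words = set(rel.split("_")) | {node}
    return len(words & question_tokens)


def walk(graph, start, question, max_hops=2):
    """Greedy multi-hop walk; returns the visited path and the final node."""
    tokens = set(question.lower().replace("?", "").split())
    node, path = start, [start]
    for _ in range(max_hops):
        actions = graph.get(node, [])
        if not actions:
            break  # dead end: stay at the current node
        # Pick the outgoing edge that best matches the question.
        rel, node = max(actions, key=lambda a: score(a, tokens))
        path.extend([rel, node])
    return path, node


graph = build_graph(TRIPLES)
path, answer = walk(graph, "man", "What color is the umbrella the man is holding?")
print(" -> ".join(path))
print("answer:", answer)
```

The walk starts at an entity mentioned in the question and follows one edge per hop, so the final node serves as the answer and the path doubles as a readable trace of the reasoning steps.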

Code & Models

Repositories

rajatkoner08/Graphhopper (PyTorch, official)
