Scene Graph Generation with Geometric Context

Vishal Kumar; Albert Mundu; Satish Kumar Singh

arXiv:2111.13131·cs.CV·November 29, 2021

Open Access

TL;DR

This paper introduces Geometric Context, a post-processing algorithm that adds and refines geometric relationships between object pairs in a scene graph, using the direction and distance between the objects.

Contribution

The work presents a geometric context algorithm that refines scene graph relationships, applies it on top of the KERN baseline, and reports results comparable to recent state-of-the-art methods.

Findings

01 Improved relationship modeling between objects using geometric cues.

02 Enhanced scene graph accuracy with the proposed post-processing.

03 Comparable performance to state-of-the-art methods.

Abstract

Scene graph generation has gained much attention in computer vision research with the growing demand for image understanding applications such as visual question answering, image captioning, self-driving cars, crowd behavior analysis, and activity recognition. A scene graph, a visually grounded graph structure of an image, greatly simplifies image understanding tasks. In this work, we introduce a post-processing algorithm called Geometric Context to better capture the geometry of visual scenes. We use this algorithm to add geometric relationships between object pairs to the output of a prior model and to refine them. We exploit this context by calculating the direction and distance between object pairs. We use the Knowledge Embedded Routing Network (KERN) as our baseline model, extend it with our algorithm, and show comparable results on the recent state-of-the-art…
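The abstract mentions calculating the direction and distance between object pairs but does not give the formulas. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes each object is an axis-aligned bounding box `(x1, y1, x2, y2)` in image coordinates, measures Euclidean distance between box centers, and measures direction as the angle from the subject's center to the object's center. The function names and the four coarse direction labels are hypothetical, chosen only for illustration.

```python
import math


def box_center(box):
    """Return the (x, y) center of an axis-aligned box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def geometric_context(subject_box, object_box):
    """Return (distance, angle_degrees) from the subject center to the object center.

    Assumption: center-to-center Euclidean distance and a counter-clockwise
    angle in [0, 360); the paper may define these quantities differently.
    """
    sx, sy = box_center(subject_box)
    ox, oy = box_center(object_box)
    dx, dy = ox - sx, oy - sy
    distance = math.hypot(dx, dy)
    # Image y grows downward, so negate dy to get a conventional angle
    # where 90 degrees means "object is above subject".
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0
    return distance, angle


def coarse_direction(angle):
    """Bucket an angle into one of four hypothetical direction labels."""
    labels = ["right", "above", "left", "below"]
    return labels[int(((angle + 45.0) % 360.0) // 90.0)]


if __name__ == "__main__":
    subject = (0, 20, 10, 30)
    obj = (0, 0, 10, 10)
    dist, ang = geometric_context(subject, obj)
    print(dist, ang, coarse_direction(ang))
```

In a post-processing setting, quantities like these could be computed for every detected object pair and then mapped to geometric predicates; how the paper performs that mapping is not stated in the text above.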

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization