A Comprehensive Survey of Scene Graphs: Generation and Application

Xiaojun Chang; Pengzhen Ren; Pengfei Xu; Zhihui Li; Xiaojiang Chen; and Alex Hauptmann

arXiv:2104.01111·cs.CV·January 10, 2022

TL;DR

This survey comprehensively reviews scene graphs, covering their generation methods, applications, datasets, and future directions, highlighting their importance in advanced scene understanding and reasoning tasks.

Contribution

It provides the first systematic and comprehensive overview of scene graph research, including generation techniques, applications, datasets, and future insights.

Findings

01. Summarizes various scene graph generation methods.

02. Details key applications like image captioning and VQA.

03. Lists major datasets used in scene graph research.

Abstract

A scene graph is a structured representation of a scene that clearly expresses the objects, their attributes, and the relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, they look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want not only to detect and recognize the objects in it, but also to know the relationships between them (visual relationship detection) and to generate a text description based on the image content (image captioning). Alternatively, we might want the machine to tell us what the little girl in the image is doing (visual question answering, VQA), or even to remove the dog from the image and find similar images (image editing and retrieval). These tasks require a higher…
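To make the definition in the abstract concrete, here is a minimal sketch of a scene graph as a data structure: objects carry attributes, and relationships are (subject, predicate, object) triples. The class name `SceneGraph`, its methods, and the "playing with" relation are illustrative assumptions, not taken from the survey.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical container: nodes are objects, edges are relationships."""

    # Maps each object name to its set of attribute strings.
    objects: dict[str, set[str]] = field(default_factory=dict)
    # Directed edges stored as (subject, predicate, object) triples.
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, name: str, *attributes: str) -> None:
        # Create the object if needed, then merge in any new attributes.
        self.objects.setdefault(name, set()).update(attributes)

    def add_relationship(self, subject: str, predicate: str, obj: str) -> None:
        # Both endpoints must already exist as objects in the scene.
        if subject not in self.objects or obj not in self.objects:
            raise KeyError("both endpoints must be added as objects first")
        self.relationships.append((subject, predicate, obj))

    def relations_of(self, subject: str) -> list[tuple[str, str]]:
        # Return (predicate, object) pairs for edges leaving `subject`.
        return [(p, o) for s, p, o in self.relationships if s == subject]


# Example scene loosely based on the abstract's "little girl" and "dog";
# the predicate is an assumption made up for illustration.
graph = SceneGraph()
graph.add_object("girl", "little")
graph.add_object("dog")
graph.add_relationship("girl", "playing with", "dog")
print(graph.relations_of("girl"))
```

A question such as "what is the little girl doing?" then reduces to looking up the edges that leave the `girl` node, which is the kind of structured query that makes scene graphs useful for tasks like VQA and retrieval.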
