Image Semantic Relation Generation

Mingzhe Du

arXiv:2210.11253·cs.CV·October 21, 2022

TL;DR

This paper introduces ISRG, an image-to-text model that simplifies scene graph generation by decoupling object segmentation from relation prediction. It scores 31 points on the OpenPSG dataset, 16 points above a ResNet-50 baseline and 5 points above a CLIP baseline.

Contribution

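The paper proposes a two-step approach to scene graph generation: first segment the image to select the qualified objects, then generate the relation between each given pair of objects as restricted auto-regressive text. The stated aims are to reduce the cost of graph annotation and to improve accuracy over the ResNet-50 and CLIP baselines.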

Findings

01. Achieved 31 points on the OpenPSG dataset.

02. Outperformed the ResNet-50 baseline by 16 points.

03. Outperformed the CLIP baseline by 5 points.

Abstract

Scene graphs provide a structured semantic understanding that goes beyond the raw image. For downstream tasks such as image retrieval, visual question answering, visual relationship detection, and even autonomous vehicle technology, scene graphs can not only distil complex image information but also correct the bias of visual models using semantic-level relations, which gives them broad application prospects. However, the heavy labour cost of constructing graph annotations may hinder the application of panoptic scene graph (PSG) generation in practical scenarios. Inspired by the observation that people usually identify the subject and object first and then determine the relationship between them, we propose to decouple the scene graph generation task into two sub-tasks: 1) an image segmentation task to pick out the qualified objects; 2) a restricted auto-regressive text generation task to generate the relation between the given objects. Therefore,…
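The two sub-tasks described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the segmentation step is stubbed out as a plain list of object labels, and the relation vocabulary (`RELATIONS`), the `toy_scorer` function, and all other names are hypothetical stand-ins for a learned model. The sketch shows only the idea of restricted decoding, where each generation step may choose only tokens that keep the output inside a closed set of relation phrases.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical closed relation vocabulary; each relation is a token sequence.
RELATIONS: List[Tuple[str, ...]] = [
    ("standing", "on"),
    ("sitting", "on"),
    ("holding",),
    ("next", "to"),
]
EOS = "<eos>"

Scorer = Callable[[Sequence[str], str], float]


def allowed_next_tokens(prefix: Sequence[str],
                        relations: Sequence[Tuple[str, ...]]) -> Set[str]:
    """Return the tokens that keep `prefix` a prefix of some allowed relation."""
    n = len(prefix)
    allowed: Set[str] = set()
    for rel in relations:
        if rel[:n] == tuple(prefix):
            # Either continue the relation or, if it is complete, allow EOS.
            allowed.add(rel[n] if n < len(rel) else EOS)
    return allowed


def restricted_greedy_decode(score: Scorer,
                             relations: Sequence[Tuple[str, ...]],
                             max_len: int = 8) -> Tuple[str, ...]:
    """Greedy decoding in which only vocabulary-consistent tokens compete."""
    prefix: List[str] = []
    for _ in range(max_len):
        allowed = allowed_next_tokens(prefix, relations)
        if not allowed:
            break
        # sorted() makes tie-breaking deterministic.
        token = max(sorted(allowed), key=lambda t: score(prefix, t))
        if token == EOS:
            break
        prefix.append(token)
    return tuple(prefix)


def toy_scorer(subj: str, obj: str) -> Scorer:
    """Stand-in for a learned model: prefers one hard-coded relation per pair."""
    preferred = {("person", "horse"): ("sitting", "on")}
    target = preferred.get((subj, obj), ("next", "to")) + (EOS,)

    def score(prefix: Sequence[str], token: str) -> float:
        i = len(prefix)
        return 1.0 if i < len(target) and target[i] == token else 0.0

    return score


def generate_triples(objects: Sequence[str]) -> List[Tuple[str, str, str]]:
    """Step 2 of the pipeline: one relation per ordered pair of given objects."""
    triples = []
    for subj in objects:
        for obj in objects:
            if subj != obj:
                rel = restricted_greedy_decode(toy_scorer(subj, obj), RELATIONS)
                triples.append((subj, " ".join(rel), obj))
    return triples


# Step 1 (segmentation) is assumed to have produced these object labels.
triples = generate_triples(["person", "horse"])
```

Because the decoder can only emit phrases from the closed vocabulary, every generated relation is valid by construction, which is the practical appeal of restricting the generation step.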

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques