Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships

Yong Liu; Ruiping Wang; Shiguang Shan; Xilin Chen

arXiv:1807.00119·cs.CV·July 3, 2018·24 cites

TL;DR

This paper introduces the Structure Inference Network (SIN), a novel object detection method that leverages scene-level context and object relationships modeled as a graph to improve detection accuracy.

Contribution

It formulates object detection as a graph structure inference problem and integrates scene context and object relationships into a detection framework.

Findings

01

Improved detection performance on PASCAL VOC and MS COCO datasets.

02

Scene context and object relationships enhance detection accuracy.

03

Outputs are more reasonable and contextually consistent.

Abstract

Context is important for accurate visual recognition. In this work we propose an object detection algorithm that considers not only object visual appearance but also two kinds of context: scene contextual information and object relationships within a single image. Object detection is therefore regarded as both a cognition problem and a reasoning problem when leveraging this structured information. Specifically, this paper formulates object detection as a problem of graph structure inference, where, given an image, the objects are treated as nodes in a graph and the relationships between objects are modeled as edges in that graph. To this end, we present the Structure Inference Network (SIN), a detector that incorporates a graphical model into a typical detection framework (e.g. Faster R-CNN) in order to infer object state. Comprehensive experiments…
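To make the graph formulation in the abstract concrete, the following is a minimal sketch of one round of inference over an object graph. It is an illustration only, not the SIN implementation: the function names (`edge_weight`, `infer_step`), the distance-based edge weighting, and the additive tanh update are all assumptions chosen for brevity. Each detected object is a node with a feature vector, each pair of objects shares an edge, and every node state is refreshed from its own features, a scene-level feature, and a weighted average of its neighbours.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def edge_weight(box_a, box_b):
    """Hypothetical edge weight: nearer boxes get a larger weight in (0, 1]."""
    (ax, ay), (bx, by) = box_center(box_a), box_center(box_b)
    return 1.0 / (1.0 + math.hypot(ax - bx, ay - by))


def infer_step(node_feats, boxes, scene_feat):
    """One illustrative update of every node state.

    node_feats: list of per-object feature vectors (the graph nodes)
    boxes:      list of (x1, y1, x2, y2) boxes, one per object
    scene_feat: one feature vector describing the whole image
    """
    n = len(node_feats)
    updated = []
    for i in range(n):
        dim = len(node_feats[i])
        others = [j for j in range(n) if j != i]
        weights = [edge_weight(boxes[i], boxes[j]) for j in others]
        total = sum(weights)

        # Aggregate neighbour features into a single message (the graph edges).
        message = [0.0] * dim
        if total > 0:
            for w, j in zip(weights, others):
                for d in range(dim):
                    message[d] += (w / total) * node_feats[j][d]

        # Assumed update rule: combine appearance, scene context, and message.
        updated.append([
            math.tanh(node_feats[i][d] + scene_feat[d] + message[d])
            for d in range(dim)
        ])
    return updated


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (10, 0, 20, 10), (100, 100, 110, 110)]
    feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(infer_step(feats, boxes, scene_feat=[0.1, 0.1]))
```

In a real detector the node features would come from region proposals and the update would be learned rather than fixed; the sketch only shows how scene context and pairwise relationships can both feed into each object's state.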

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection