Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation

François Plesse; Alexandru Ginsca; Bertrand Delezoide; Françoise Prêteux

arXiv:1805.10802 · cs.CV · May 29, 2018

TL;DR

This paper proposes a visual relationship detection framework that distills semantic knowledge into the network and estimates the relevance of object pairs during both training and testing, improving results on all classes of the Visual Genome dataset.

Contribution

The proposed method combines semantic knowledge distillation with relevance estimation of object pairs to improve visual relationship detection.
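To make the idea of distilling semantic knowledge into a network more concrete, here is a minimal sketch of a generic soft-target distillation loss. It is not confirmed to be the paper's formulation: the function names, the `alpha` weighting, and the use of a fixed "teacher" distribution over predicates (standing in for a semantic prior for one object pair) are all assumptions made for illustration.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_probs: List[float],
    label: int,
    alpha: float = 0.5,
) -> float:
    """Hypothetical loss: (1 - alpha) * cross-entropy + alpha * KL(teacher || student).

    teacher_probs stands in for a semantic prior over predicate classes
    (an assumption, not the paper's exact teacher).
    """
    student_probs = softmax(student_logits)
    # Cross-entropy against the annotated predicate class.
    ce = -math.log(student_probs[label])
    # KL divergence pulls the student toward the teacher distribution.
    kl = sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0.0
    )
    return (1.0 - alpha) * ce + alpha * kl


# Illustrative values only: three predicate classes, annotated class 0.
student_logits = [2.0, 0.5, -1.0]
agreeing_teacher = softmax(student_logits)   # teacher identical to student
disagreeing_teacher = [0.1, 0.8, 0.1]        # teacher prefers class 1

loss_agree = distillation_loss(student_logits, agreeing_teacher, label=0, alpha=1.0)
loss_disagree = distillation_loss(student_logits, disagreeing_teacher, label=0, alpha=1.0)
```

With `alpha=1.0` only the teacher term is active, so the loss vanishes when the student already matches the teacher and grows when the two distributions disagree.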

Findings

01. 68.5% relative gain on recall at 100, attributed to relevance estimation

02. 32.7% gain from knowledge distillation

03. Significant improvement on all classes of the Visual Genome dataset
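The findings above are stated as relative gains on recall at 100. As a rough sketch of how such figures are usually read, the snippet below computes recall at K for ranked relationship triplets and the relative gain between two recall values. It simplifies the metric by matching triplet labels only (real evaluation protocols typically also require bounding-box overlap), and every triplet, score, and recall value in it is hypothetical, not taken from the paper.

```python
from typing import List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def recall_at_k(
    scored: List[Tuple[Triplet, float]],
    ground_truth: Set[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found among the k highest-scored predictions."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    top_k = sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
    hits = {triplet for triplet, _ in top_k if triplet in ground_truth}
    return len(hits) / len(ground_truth)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative gain of improved over baseline, e.g. 0.685 means +68.5%."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical predictions for one image (not data from the paper).
preds = [
    (("person", "rides", "horse"), 0.9),
    (("person", "wears", "hat"), 0.8),
    (("hat", "on", "horse"), 0.7),
    (("horse", "on", "grass"), 0.3),
]
gt = {("person", "rides", "horse"), ("horse", "on", "grass")}

r_at_2 = recall_at_k(preds, gt, k=2)  # 0.5: one of two ground-truth triplets in the top 2
r_at_4 = recall_at_k(preds, gt, k=4)  # 1.0: both ground-truth triplets in the top 4

# Illustrative recall values chosen only to show what a 68.5% relative gain means.
gain = relative_gain(baseline=0.2, improved=0.337)
```

A relative gain compares the improvement to the baseline value, so a 68.5% relative gain does not mean recall rose by 68.5 percentage points.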

Abstract

A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world. One of the key issues is to describe the visual relationships between objects. When dealing with real world data, capturing these very diverse interactions is a difficult problem. It can be alleviated by incorporating common sense in a network. For this, we propose a framework that makes use of semantic knowledge and estimates the relevance of object pairs during both training and test phases. Extracted from precomputed models and training annotations, this information is distilled into the neural network dedicated to this task. Using this approach, we observe a significant improvement on all classes of Visual Genome, a challenging visual relationship dataset. A 68.5% relative gain on the recall at 100 is directly related to the relevance estimate and a 32.7% gain…
