Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation
François Plesse, Alexandru Ginsca, Bertrand Delezoide, Françoise Prêteux

TL;DR
This paper introduces a novel framework for visual relationship detection that leverages semantic knowledge and relevance estimation, significantly improving performance on the Visual Genome dataset by incorporating common sense and knowledge distillation.
Contribution
The proposed method uniquely combines semantic knowledge distillation and relevance estimation to enhance visual relationship detection accuracy.
Findings
68.5% relative gain on recall at 100 due to relevance estimation
32.7% improvement from knowledge distillation
Significant improvement on all classes of the Visual Genome dataset
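
The findings above are stated as relative gains on recall at 100. As a rough illustration of how those two quantities are usually computed, here is a minimal sketch. It is not the authors' evaluation code: the function names `recall_at_k` and `relative_gain` are hypothetical, and triplets are matched by label only, whereas visual relationship benchmarks typically also require the predicted boxes to overlap the ground-truth boxes.

```python
from typing import Hashable, Iterable, Sequence, Tuple

# A relationship is modeled as a (subject, predicate, object) label triplet.
# Assumption: label-only matching, with no bounding-box overlap check.
Triplet = Tuple[Hashable, Hashable, Hashable]


def recall_at_k(ranked_predictions: Sequence[Triplet],
                ground_truth: Iterable[Triplet],
                k: int = 100) -> float:
    """Fraction of ground-truth triplets found among the top-k predictions.

    `ranked_predictions` must already be sorted by decreasing confidence.
    """
    truth = set(ground_truth)
    if not truth:
        return 0.0
    top_k = set(ranked_predictions[:k])
    return len(truth & top_k) / len(truth)


def relative_gain(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline
```

Under this definition, a relative gain of 0.685 (68.5%) means the improved recall is 1.685 times the baseline recall; it is not a gain of 68.5 percentage points.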
Abstract
A thorough comprehension of image content demands a complex grasp of the interactions that may occur in the natural world. One of the key issues is to describe the visual relationships between objects. When dealing with real-world data, capturing these very diverse interactions is a difficult problem. It can be alleviated by incorporating common sense in a network. For this, we propose a framework that makes use of semantic knowledge and estimates the relevance of object pairs during both training and test phases. Extracted from precomputed models and training annotations, this information is distilled into the neural network dedicated to this task. Using this approach, we observe a significant improvement on all classes of Visual Genome, a challenging visual relationship dataset. A 68.5% relative gain on the recall at 100 is directly related to the relevance estimate and a 32.7% gain…
