Knowledge-Embedded Routing Network for Scene Graph Generation
Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin

TL;DR
This paper introduces the Knowledge-Embedded Routing Network, which exploits statistical correlations between object pairs and their relationships to improve relationship prediction, especially for less frequent relationships, and reports better results than prior methods on the Visual Genome dataset.
Contribution
The paper proposes a novel network that embeds statistical relationship knowledge via a structured graph and routing mechanism, enhancing scene graph generation accuracy.
Findings
Outperforms state-of-the-art methods on the Visual Genome dataset
Effectively addresses unbalanced relationship distribution
Utilizes a structured knowledge graph for better relationship inference
Abstract
Understanding a scene in depth involves not only locating and recognizing individual objects but also inferring the relationships and interactions among them. However, because the distribution of real-world relationships is seriously unbalanced, existing methods perform quite poorly on the less frequent relationships. In this work, we find that the statistical correlations between object pairs and their relationships can effectively regularize the semantic space and make prediction less ambiguous, and thus address the unbalanced distribution issue well. To achieve this, we incorporate these statistical correlations into deep neural networks to facilitate scene graph generation by developing a Knowledge-Embedded Routing Network. More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a…
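As a rough illustration of what "statistical correlations between object pairs and their relationships" can mean in practice, the sketch below estimates the empirical distribution of relationships for each (subject, object) class pair from a list of annotated triples. This is an assumption-laden toy, not the authors' implementation: the function name `relationship_priors` and the toy triples are hypothetical, and the paper's actual graph construction and routing mechanism are not reproduced here.

```python
from collections import Counter, defaultdict


def relationship_priors(triples):
    """Estimate P(relationship | subject class, object class) from raw counts.

    `triples` is an iterable of (subject, relationship, object) labels.
    Hypothetical helper for illustration only.
    """
    pair_counts = defaultdict(Counter)
    for subj, rel, obj in triples:
        pair_counts[(subj, obj)][rel] += 1

    priors = {}
    for pair, counts in pair_counts.items():
        total = sum(counts.values())
        # Normalize the counts for this pair into a probability distribution.
        priors[pair] = {rel: n / total for rel, n in counts.items()}
    return priors


# Toy, made-up annotations (not taken from Visual Genome).
toy_triples = [
    ("man", "riding", "horse"),
    ("man", "riding", "horse"),
    ("man", "riding", "horse"),
    ("man", "feeding", "horse"),
    ("man", "wearing", "hat"),
]

priors = relationship_priors(toy_triples)
print(priors[("man", "horse")])  # {'riding': 0.75, 'feeding': 0.25}
```

Even in this toy, knowing that the pair is (man, horse) narrows the plausible relationships to a few candidates, which is the intuition behind using such statistics to make prediction less ambiguous for rare relationships.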
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
