Prototype-based Embedding Network for Scene Graph Generation
Chaofan Zheng, Xinyu Lyu, Lianli Gao, Bo Dai, and Jingkuan Song

TL;DR
This paper introduces PE-Net, a prototype-based embedding network that improves scene graph generation by modeling entities and predicates with class-wise prototypes, leading to more robust relation prediction.
Contribution
It proposes a novel prototype-based embedding approach and associated learning and regularization techniques to enhance relation recognition in scene graph generation.
Findings
Achieves state-of-the-art performance on the Visual Genome dataset.
Outperforms existing methods on the Open Images dataset.
Demonstrates robustness in relation prediction tasks.
Abstract
Current Scene Graph Generation (SGG) methods explore contextual information to predict relationships among entity pairs. However, due to the diverse visual appearance of numerous possible subject-object combinations, there is large intra-class variation within each predicate category (e.g., "man-eating-pizza" vs. "giraffe-eating-leaf") and severe inter-class similarity between different categories (e.g., "man-holding-plate" vs. "man-eating-pizza") in the model's latent space. These challenges prevent current SGG methods from acquiring robust features for reliable relation prediction. In this paper, we claim that a predicate's category-inherent semantics can serve as class-wise prototypes in the semantic space to relieve these challenges. To this end, we propose the Prototype-based Embedding Network (PE-Net), which models entities/predicates with prototype-aligned compact and distinctive…
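The general idea of matching a relation feature against class-wise prototypes can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each predicate class has one prototype vector and that matching uses cosine similarity, neither of which is confirmed by the truncated abstract above. The function names and the toy 3-dimensional vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def predict_predicate(relation_embedding, prototypes):
    """Pick the predicate whose prototype is most similar to the embedding.

    `prototypes` maps a predicate label to its prototype vector.
    Returns the best label and the per-class similarity scores.
    """
    scores = {
        label: cosine_similarity(relation_embedding, proto)
        for label, proto in prototypes.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy prototypes, one per predicate class (assumption for illustration).
PROTOTYPES = {
    "eating": [1.0, 0.0, 0.0],
    "holding": [0.0, 1.0, 0.0],
    "riding": [0.0, 0.0, 1.0],
}

# A made-up relation embedding that lies closest to the "eating" prototype.
label, scores = predict_predicate([0.9, 0.3, 0.1], PROTOTYPES)
print(label)
```

In a sketch like this, compact intra-class features correspond to embeddings that sit close to their own prototype, and distinctive inter-class features correspond to prototypes that are far from each other.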
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
