Learning To Generate Scene Graph from Head to Tail
Chaofan Zheng, Xinyu Lyu, Yuyu Guo, Pengpeng Zeng, Jingkuan Song, Lianli Gao

TL;DR
This paper introduces SGG-HT, a novel scene graph generation framework that addresses predicate imbalance by curriculum learning and semantic consistency, achieving state-of-the-art results on Visual Genome.
Contribution
The paper proposes a new framework that combines a Curriculum Re-weight Mechanism (CRM) and a Semantic Context Module (SCM) to improve scene graph generation by balancing predicate learning and maintaining semantic accuracy.
Findings
Significantly alleviates bias towards head predicates.
Achieves state-of-the-art performance on Visual Genome.
Effectively maintains semantic consistency in generated graphs.
Abstract
Scene Graph Generation (SGG) represents objects and their interactions with a graph structure. Many recent works are devoted to solving the imbalance problem in SGG. However, by underestimating the head predicates throughout the training process, they damage the features of head predicates, which provide general features for tail ones. In addition, assigning excessive attention to the tail predicates leads to semantic deviation. Based on this, we propose a novel SGG framework, learning to generate scene graphs from Head to Tail (SGG-HT), containing a Curriculum Re-weight Mechanism (CRM) and a Semantic Context Module (SCM). CRM first learns head/easy samples to obtain robust features of head predicates and then gradually focuses on tail/hard ones. SCM is proposed to relieve semantic deviation by ensuring the semantic consistency between the generated scene graph and the ground truth in global and local…
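The head-to-tail idea behind CRM can be illustrated with a small sketch. This is not the paper's formula, which the abstract does not give; it is one simple way to realise a curriculum over class weights, assuming per-predicate training counts are available. The function name `curriculum_weights`, the `progress` parameter, and the power-law interpolation are all hypothetical choices made for illustration.

```python
def curriculum_weights(class_counts, progress):
    """Per-predicate loss weights that shift emphasis from head to tail classes.

    Hypothetical schedule (an assumption, not the CRM definition):
    weight_i is proportional to count_i ** (-progress).
      progress = 0.0 -> every weight is 1, so frequent (head) predicates
                        dominate the loss simply because they have more samples.
      progress = 1.0 -> inverse-frequency weights, so rare (tail) predicates
                        receive the largest weight.
    Weights are normalised to have mean 1 at every step.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    if any(count <= 0 for count in class_counts):
        raise ValueError("class counts must be positive")

    raw = [count ** (-progress) for count in class_counts]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


# Toy counts for three predicates, ordered head -> tail (illustrative numbers only).
counts = [1000, 100, 10]
for step_fraction in (0.0, 0.5, 1.0):
    print(step_fraction, curriculum_weights(counts, step_fraction))
```

In a training loop, `progress` would typically be the current epoch divided by the total number of epochs, and the returned list would be passed as class weights to a weighted cross-entropy loss over predicates.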
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
