RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning
Kanghoon Yoon, Kibum Kim, Jaehyung Jeon, Yeonjun In, Donghyun Kim, Chanyoung Park

TL;DR
This paper introduces RA-SGG, a retrieval-augmented framework for scene graph generation that addresses long-tailed predicate distribution and semantic ambiguity by leveraging multi-label classification and multi-prototype learning, improving accuracy on benchmark datasets.
Contribution
The paper proposes a novel retrieval-augmented multi-label learning approach for scene graph generation, effectively mitigating bias and semantic ambiguity issues.
Findings
Outperforms state-of-the-art methods by up to 3.6% on the VG (Visual Genome) dataset and 5.9% on the GQA dataset.
Effectively alleviates bias caused by long-tailed predicate distribution.
Enhances predicate prediction accuracy through multi-label augmentation.
Abstract
Scene Graph Generation (SGG) research has suffered from two fundamental challenges: the long-tailed predicate distribution and semantic ambiguity between predicates. These challenges lead to a bias towards head predicates in SGG models, favoring dominant general predicates while overlooking fine-grained predicates. In this paper, we address the challenges of SGG by framing it as a multi-label classification problem with partial annotation, where relevant labels of fine-grained predicates are missing. Under this new framing, we propose Retrieval-Augmented Scene Graph Generation (RA-SGG), which identifies potential instances to be multi-labeled and enriches the single-label with multi-labels that are semantically similar to the original label by retrieving relevant samples from our established memory bank. Based on augmented relations (i.e., discovered multi-labels), we apply multi-prototype…
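The retrieval step described in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' implementation: the function names (`cosine`, `augment_labels`), the toy two-dimensional embeddings, and the neighbor-voting rule with `k` and `min_votes` are all assumptions made here to show how a single label might be enriched with a related label retrieved from a memory bank.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_labels(query_vec, original_label, memory_bank, k=3, min_votes=2):
    """Return the original label plus labels that recur among the k nearest entries.

    memory_bank is a list of (embedding, predicate_label) pairs. The voting
    rule (k, min_votes) is a hypothetical stand-in for the paper's criterion.
    """
    ranked = sorted(
        memory_bank,
        key=lambda item: cosine(query_vec, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    extra = {
        label
        for label, count in votes.items()
        if count >= min_votes and label != original_label
    }
    return [original_label] + sorted(extra)


# Toy memory bank: (relation embedding, predicate label). Values are made up.
memory_bank = [
    ([1.0, 0.1], "standing on"),
    ([0.9, 0.2], "standing on"),
    ([0.8, 0.3], "on"),
    ([0.0, 1.0], "holding"),
]

# A relation annotated only with the general predicate "on".
labels = augment_labels([1.0, 0.15], "on", memory_bank)
print(labels)  # ['on', 'standing on']
```

In this toy setup, the relation annotated with the general predicate "on" also receives the fine-grained predicate "standing on", because two of its three nearest neighbors in the memory bank carry that label.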
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
