Unbiased Scene Graph Generation using Predicate Similarities
Misaki Ohashi, Yusuke Matsui

TL;DR
This paper introduces an approach to unbiased scene graph generation that uses predicate similarities and transfer learning to improve the classification of tail (infrequent) predicates in visual relationship tasks.
Contribution
It proposes a new classification scheme based on predicate similarities and incorporates transfer learning to better handle infrequent predicates in scene graph generation.
Findings
Improved performance on tail predicates in SGCls/SGDet tasks.
Combining the method with existing debiasing approaches enhances results.
Overall performance still lags behind current state-of-the-art methods.
Abstract
Scene graphs are widely applied in computer vision as a graphical representation of relationships between objects shown in images. However, these applications have not yet reached a practical stage of development owing to biased training caused by long-tailed predicate distributions. In recent years, many studies have tackled this problem. In contrast, relatively few works have considered predicate similarities, a dataset characteristic that also leads to biased predictions. Because of this characteristic, infrequent predicates (e.g., parked on, covered in) are easily misclassified as closely related frequent predicates (e.g., on, in). Utilizing predicate similarities, we propose a new classification scheme that branches the process to several fine-grained classifiers for similar predicate groups. The classifiers aim to capture the differences among similar predicates in detail. We also…
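The branching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the grouping in PREDICATE_GROUPS, the function name classify_predicate, and the use of precomputed score dictionaries in place of learned classifiers are all assumptions made for illustration. Only the predicate names (on, parked on, in, covered in) come from the abstract.

```python
from typing import Dict, List

# Hypothetical grouping: each frequent predicate is grouped with the
# similar, infrequent predicates that tend to be confused with it.
PREDICATE_GROUPS: Dict[str, List[str]] = {
    "on": ["on", "parked on"],
    "in": ["in", "covered in"],
}


def classify_predicate(
    group_scores: Dict[str, float],
    fine_scores: Dict[str, Dict[str, float]],
) -> str:
    """Two-stage prediction: pick a similarity group, then a predicate in it.

    group_scores: score per group (stand-in for a coarse classifier's output).
    fine_scores:  per-group scores over that group's members (stand-in for
                  the outputs of the fine-grained classifiers).
    """
    # Stage 1: choose the group of similar predicates.
    group = max(group_scores, key=lambda g: group_scores[g])

    # Stage 2: branch to that group's fine-grained classifier, which only
    # has to separate closely related predicates from one another.
    members = PREDICATE_GROUPS[group]
    scores = fine_scores[group]
    return max(members, key=lambda p: scores.get(p, float("-inf")))


if __name__ == "__main__":
    group_scores = {"on": 0.8, "in": 0.2}
    fine_scores = {
        "on": {"on": 0.4, "parked on": 0.6},
        "in": {"in": 0.7, "covered in": 0.3},
    }
    print(classify_predicate(group_scores, fine_scores))
```

In this sketch the fine-grained stage never compares parked on against unrelated predicates, so a tail predicate competes only with its close neighbors. How the groups are actually formed and how the classifiers are trained is not specified in the truncated abstract.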
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
