Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita, Satoshi Oyama, Itsuki Noda

TL;DR
This paper introduces a predicate classification method for scene graph generation that trains with an optimal transport loss to mitigate label imbalance. Transport costs are derived from word similarity in a pre-trained model, and the method improves mean recall metrics.
Contribution
The study proposes using optimal transport loss with word similarity for predicate classification, addressing label imbalance in scene graph generation.
Findings
Outperforms existing methods in mean Recall@50 and mean Recall@100
Improves recall for rare relationship labels
Demonstrates effectiveness of optimal transport loss in SGG
Abstract
In scene graph generation (SGG), learning with cross-entropy loss yields biased predictions owing to the severe imbalance in the distribution of relationship labels in the dataset. This study therefore proposes a method to generate scene graphs using optimal transport as a measure for comparing two probability distributions. We apply learning with the optimal transport loss, which reflects the similarity between labels in terms of transportation cost, to predicate classification in SGG. In the proposed approach, the transportation cost is defined using the similarity of words obtained from a pre-trained model. Experimental evaluation demonstrates that the proposed method outperforms existing methods in terms of mean Recall@50 and mean Recall@100. Furthermore, it improves recall for relationship labels that are scarce in the dataset.
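To make the idea of a similarity-aware loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes the cost between two predicate labels is 1 minus the cosine similarity of their word embeddings, and that the target is a one-hot label, in which case the optimal transport cost has a closed form. The three labels and their 2-D embeddings are made up for illustration; the paper's actual cost definition, solver, and embedding model may differ.

```python
import numpy as np

def cosine_cost_matrix(embeddings):
    """Cost C[i, j] = 1 - cosine similarity between label embeddings i and j."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def ot_loss_one_hot(probs, target, cost):
    """Exact OT cost between a predicted distribution and a one-hot target.

    All target mass sits on one label, so the only feasible transport plan
    moves every unit of predicted mass to that label. The cost is therefore
    sum_i probs[i] * cost[i, target].
    """
    return float(probs @ cost[:, target])

# Hypothetical labels and toy embeddings (assumptions, not from the paper).
labels = ["on", "above", "eating"]
embeddings = np.array([
    [1.0, 0.0],   # "on"
    [0.8, 0.6],   # "above": close to "on"
    [0.0, 1.0],   # "eating": far from "on"
])
cost = cosine_cost_matrix(embeddings)

target = labels.index("on")
pred_near = np.array([0.1, 0.9, 0.0])  # most mass on the similar label "above"
pred_far = np.array([0.1, 0.0, 0.9])   # most mass on the dissimilar label "eating"

loss_near = ot_loss_one_hot(pred_near, target, cost)
loss_far = ot_loss_one_hot(pred_far, target, cost)

# Cross-entropy depends only on the probability of the true label,
# so it cannot tell these two predictions apart.
ce_near = -np.log(pred_near[target])
ce_far = -np.log(pred_far[target])
```

In this toy setup both predictions assign probability 0.1 to the true label, so cross-entropy scores them identically, while the transport-based loss penalizes the prediction concentrated on the dissimilar label more heavily. This is the property the abstract refers to when it says the loss reflects label similarity through the transportation cost.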
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Text and Document Classification Technologies
