Fine-Grained Predicates Learning for Scene Graph Generation

Xinyu Lyu; Lianli Gao; Yuyu Guo; Zhou Zhao; Hao Huang; Heng Tao Shen; Jingkuan Song

arXiv:2204.02597 · cs.CV · April 11, 2022 · 1 citation

TL;DR

This paper introduces Fine-Grained Predicates Learning (FGPL), a novel approach to improve scene graph generation by better distinguishing hard-to-separate predicates, significantly enhancing model performance on benchmark datasets.

Contribution

The paper proposes a new method that combines a Predicate Lattice with specialized loss functions to differentiate fine-grained predicates, outperforming state-of-the-art methods.

Findings

1. Boosts three benchmark models' mean recall by over 21%.
2. Outperforms state-of-the-art methods by up to 6.1% in mean recall.
3. Effectively distinguishes hard-to-separate predicates in scene graphs.
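
The findings are reported in mean recall rather than plain recall, and the difference matters for rare predicates. The sketch below is a simplified illustration, not the benchmark's evaluation code: it pools all ground-truth triplets, whereas SGG toolkits typically aggregate per image and at a top-K cutoff. The function name, the input format, and the toy data are assumptions made for this example only.

```python
from collections import defaultdict


def recall_and_mean_recall(gt_predicates, hits):
    """Compare pooled recall with mean (per-predicate) recall.

    gt_predicates: predicate label of each ground-truth triplet (hypothetical format).
    hits: booleans, True if that triplet was recovered by the model.
    """
    total = defaultdict(int)
    hit = defaultdict(int)
    for pred, was_hit in zip(gt_predicates, hits):
        total[pred] += 1
        hit[pred] += int(was_hit)

    # Plain recall: every triplet counts equally, so frequent predicates dominate.
    recall = sum(hit.values()) / len(gt_predicates)

    # Mean recall: compute recall per predicate class, then average the classes.
    per_class = {p: hit[p] / total[p] for p in total}
    mean_recall = sum(per_class.values()) / len(per_class)
    return recall, mean_recall, per_class


# Toy data (invented): the model always answers "on", so it recovers every
# "on" triplet but never the finer-grained "standing on".
labels = ["on"] * 8 + ["standing on"] * 2
recovered = [True] * 8 + [False] * 2
recall, mean_recall, per_class = recall_and_mean_recall(labels, recovered)
print(recall, mean_recall)  # 0.8 0.5
```

In this toy case plain recall stays high at 0.8 while mean recall drops to 0.5, which is why a gain in mean recall suggests better handling of less frequent, fine-grained predicates.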

Abstract

The performance of current Scene Graph Generation (SGG) models is severely hampered by some hard-to-distinguish predicates, e.g., "woman-on/standing on/walking on-beach" or "woman-near/looking at/in front of-child". While general SGG models tend to predict head predicates and existing re-balancing strategies favor tail categories, neither can appropriately handle these hard-to-distinguish predicates. To tackle this issue, inspired by fine-grained image classification, which focuses on differentiating among hard-to-distinguish object classes, we propose a method named Fine-Grained Predicates Learning (FGPL), which aims at differentiating among hard-to-distinguish predicates for the Scene Graph Generation task. Specifically, we first introduce a Predicate Lattice that helps SGG models figure out fine-grained predicate pairs. Then, utilizing the Predicate Lattice, we propose a Category…

Code & Models

Repositories

xinyulyu/fgpl (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition