Visual Distant Supervision for Scene Graph Generation

Yuan Yao; Ao Zhang; Xu Han; Mengdi Li; Cornelius Weber; Zhiyuan Liu; Stefan Wermter; Maosong Sun

arXiv:2103.15365·cs.CV·August 23, 2021

TL;DR

This paper introduces visual distant supervision, a paradigm for scene graph generation that aligns commonsense knowledge bases with images to create large-scale labeled data automatically. Scene graph models can then be trained without human-labeled data, and the resulting model outperforms weakly supervised and semi-supervised baselines.

Contribution

The work proposes a distant supervision framework for visual relation learning that generates labeled data automatically and reduces label noise by iteratively estimating probabilistic relation labels and eliminating the noisy ones.

Findings

01 Outperforms weakly supervised and semi-supervised baselines

02 Outperforms fully supervised models when human-labeled data is additionally incorporated in a semi-supervised fashion

03 Demonstrates the effectiveness of knowledge-base-aligned distant supervision

Abstract

Scene graph generation aims to identify objects and their relations in images, providing structured image representations that can facilitate numerous applications in computer vision. However, scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation. In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data. The intuition is that by aligning commonsense knowledge bases and images, we can automatically create large-scale labeled data to provide distant supervision for visual relation learning. To alleviate the noise in distantly labeled data, we further propose a framework that iteratively estimates the probabilistic relation labels and eliminates the noisy ones. Comprehensive experimental results show that our…
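
The alignment idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base `KB`, the function `distant_labels`, and the object class names are all hypothetical, and the sketch covers only the labeling step, not the iterative denoising. It assumes object classes have already been detected and assigns each ordered object pair every relation the knowledge base lists for that class pair.

```python
from itertools import permutations

# Hypothetical toy knowledge base (an assumption for illustration):
# (subject class, object class) -> relations known to hold between them.
KB = {
    ("person", "horse"): {"riding", "feeding"},
    ("person", "hat"): {"wearing"},
}


def distant_labels(detected_objects, kb):
    """Return {(subject index, object index): candidate relations}.

    Every ordered pair of detected objects whose class pair appears in the
    knowledge base receives all relations listed there. No human annotation
    is consulted, so some of these labels will be wrong for a given image.
    """
    labels = {}
    for (i, subj), (j, obj) in permutations(enumerate(detected_objects), 2):
        candidates = kb.get((subj, obj))
        if candidates:
            labels[(i, j)] = set(candidates)
    return labels


# Object classes assumed to come from an off-the-shelf detector.
objects = ["person", "horse", "hat"]
labels = distant_labels(objects, KB)
```

Here the pair (person, horse) receives both "riding" and "feeding" even though at most one may hold in a particular image, which is the kind of label noise the denoising framework described in the abstract is meant to address.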

Code & Models

Repositories

thunlp/visualds (PyTorch, official)
