Weakly-supervised learning of visual relations

Julia Peyre; Ivan Laptev; Cordelia Schmid; Josef Sivic

arXiv:1707.09472 · cs.CV · August 1, 2017 · 19 cites

TL;DR

This paper presents a weakly-supervised approach for learning visual object relations using discriminative clustering, novel features, and a new dataset, achieving state-of-the-art results especially in zero-shot relation recognition.

Contribution

It introduces a weakly-supervised discriminative clustering model, new visual features for relation modeling, and a challenging dataset for evaluating visual relation retrieval.

Findings

01. State-of-the-art performance on visual relation dataset

02. Significant improvement in zero-shot relation recognition

03. Effective relation modeling with weak supervision

Abstract

This paper introduces a novel approach for modeling visual relations between pairs of objects. We define a relation as a triplet of the form (subject, predicate, object), where the predicate is typically a preposition (e.g. 'under', 'in front of') or a verb (e.g. 'hold', 'ride') that links a pair of objects (subject, object). Learning such relations is challenging because the objects have different spatial configurations and appearances depending on the relation in which they occur. Another major challenge is the difficulty of obtaining annotations, especially at the box level, for all possible triplets, which makes both learning and evaluation difficult. The contributions of this paper are threefold. First, we design strong yet flexible visual features that encode the appearance and spatial configuration for pairs of objects. Second, we propose a weakly-supervised discriminative clustering model to learn…
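To make the idea of encoding the "spatial configuration for pairs of objects" concrete, here is a minimal sketch of one plausible encoding for a (subject, object) box pair. The abstract does not specify the features, so everything in it is an assumption made for illustration: the `Box` type, the `iou` and `spatial_features` helpers, and the choice of six quantities (normalized offset, relative scale, overlap, aspect ratios). It should not be read as the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box: top-left corner (x, y), width w, height h.

    Assumes w > 0 and h > 0.
    """
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)

    @property
    def area(self):
        return self.w * self.h


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def spatial_features(subject: Box, obj: Box) -> list:
    """Illustrative 6-d description of how obj is placed relative to subject.

    Offsets are divided by the subject's size so the vector does not
    change when the whole image is rescaled.
    """
    scale = math.sqrt(subject.area)
    (sx, sy), (ox, oy) = subject.center, obj.center
    return [
        (ox - sx) / scale,                   # horizontal offset of the centers
        (oy - sy) / scale,                   # vertical offset of the centers
        math.sqrt(obj.area / subject.area),  # relative size of the object
        iou(subject, obj),                   # overlap between the two boxes
        subject.w / subject.h,               # aspect ratio of the subject
        obj.w / obj.h,                       # aspect ratio of the object
    ]


# Example pair for a triplet such as (person, ride, horse); coordinates are made up.
person = Box(x=30, y=10, w=20, h=40)
horse = Box(x=10, y=35, w=60, h=40)
features = spatial_features(person, horse)
```

A vector like this captures only geometry. In the setting the abstract describes, it would be combined with appearance features for the two objects before a predicate classifier is learned.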

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning