Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features

Xu Yang; Hanwang Zhang; Jianfei Cai

arXiv:1808.00171·cs.CV·August 2, 2018

TL;DR

This paper introduces a novel pre-training strategy called Shuffle-Then-Assemble to learn object-agnostic visual features, improving the generalization of visual relationship models to rare or unseen object pairs.

Contribution

It proposes a new pre-training method that reduces object bias in visual relationship models by recovering object pairs from unpaired object domains.

Findings

01 Pre-trained features improve relationship model performance.

02 Outperforms state-of-the-art relationship models.

03 Enhances generalization to rare or unseen object pairs.

Abstract

Because completely annotating visual relationships, i.e., the (obj1, rel, obj2) triplets, is prohibitively expensive, relationship models are inevitably biased toward object classes with limited pairwise patterns, and so generalize poorly to rare or unseen object combinations. We are therefore interested in learning object-agnostic visual features for more generalizable relationship models. By "agnostic", we mean that the feature is less likely to be biased toward the classes of the paired objects. To alleviate the bias, we propose a novel Shuffle-Then-Assemble pre-training strategy. First, we discard all the triplet relationship annotations in an image, leaving two unpaired object domains without obj1-obj2 alignment. Then, our feature learning recovers possible obj1-obj2 pairs. In particular, we design a cycle of residual transformations between the two domains, to…
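As a rough illustration of what a "cycle of residual transformations between the two domains" could look like, the sketch below maps an obj1 feature to the obj2 domain and back, then measures how far the round trip drifts from the input. Everything here is an assumption made for illustration: the abstract is truncated before it says how the cycle is used or trained, so the linear residual form `x + W x`, the squared-error round-trip loss, and the names `residual_transform`, `cycle_loss`, and `random_weights` are hypothetical and are not taken from the paper or from the yangxuntu/vrd repository.

```python
import random


def residual_transform(x, weights):
    """Return x + W @ x: the input plus a linear correction (assumed form)."""
    return [
        xi + sum(w * xj for w, xj in zip(row, x))
        for xi, row in zip(x, weights)
    ]


def cycle_loss(x, w_forward, w_backward):
    """Squared error between x and its round trip obj1 -> obj2 -> obj1.

    Using a reconstruction loss here is an assumption; the visible part of
    the abstract does not state the training objective.
    """
    y = residual_transform(x, w_forward)        # obj1 domain -> obj2 domain
    x_back = residual_transform(y, w_backward)  # obj2 domain -> obj1 domain
    return sum((a - b) ** 2 for a, b in zip(x, x_back))


def random_weights(dim, scale, rng):
    """Hypothetical helper: a dim x dim matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(dim)] for _ in range(dim)]


rng = random.Random(0)
dim = 4
obj1_feature = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

# With zero weights each transformation is the identity, so the cycle is exact.
zero = [[0.0] * dim for _ in range(dim)]
identity_loss = cycle_loss(obj1_feature, zero, zero)

# With untrained random weights the round trip generally drifts from the input.
random_loss = cycle_loss(
    obj1_feature,
    random_weights(dim, 0.1, rng),
    random_weights(dim, 0.1, rng),
)
```

The residual form means each transformation starts close to the identity, so under these assumptions the round-trip error is zero when the weights are zero and grows as the two mappings stop inverting each other.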

Code & Models

Repositories

yangxuntu/vrd · tf · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition