Tensor Composition Net for Visual Relationship Prediction

Yuting Qiang; Yongxin Yang; Xueting Zhang; Yanwen Guo; Timothy M. Hospedales

arXiv:2012.05473 · cs.CV · February 10, 2022 · 1 citation

Open Access

TL;DR

The paper introduces a Tensor Composition Net that leverages low-rank tensor properties to improve visual relationship prediction, enabling the prediction of unseen relationships and enhancing image-retrieval tasks.

Contribution

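A novel Tensor Composition Net that uses tensor decomposition to make structured predictions of visual relationships, including relationships unseen during training, and that outperforms the Multi-Label and eXtreme Multi-label Classification methods it is compared against.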

Findings

1. Outperforms Multi-Label Image Classification (MLIC) and eXtreme Multi-label Classification (XMC) methods.
2. Can predict visual relationships that were not seen in the training dataset.
3. Provides efficient relation-based image retrieval.

Abstract

We present a novel Tensor Composition Net (TCN) to predict visual relationships in images. Visual Relationship Prediction (VRP) provides a more challenging test of image understanding than conventional image tagging and is difficult to learn due to a large label space and incomplete annotation. The key idea of our TCN is to exploit the low-rank property of the visual relationship tensor, so as to leverage correlations within and across objects and relations and make a structured prediction of all visual relationships in an image. To show the effectiveness of our model, we first empirically compare our model with Multi-Label Image Classification (MLIC) methods, eXtreme Multi-label Classification (XMC) methods, and Visual Relationship Detection (VRD) methods. We then show that thanks to our tensor (de)composition layer, our model can predict visual relationships which have not been seen in the training dataset. We…
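
The low-rank idea in the abstract can be illustrated with a small NumPy sketch. It assumes a CP-style (sum of rank-one terms) composition of a subject × predicate × object score tensor; the abstract does not state which decomposition TCN actually uses, so the function name `compose_relationship_tensor`, the factor shapes, and the random factors below are hypothetical stand-ins, not the authors' implementation. In a trained model the factors would presumably be predicted from image features rather than sampled.

```python
import numpy as np


def compose_relationship_tensor(subj, pred, obj):
    """Compose an (S, P, O) score tensor from rank-R factor matrices.

    subj: (S, R) subject factors, pred: (P, R) predicate factors,
    obj: (O, R) object factors. Entry [s, p, o] is the sum over r of
    subj[s, r] * pred[p, r] * obj[o, r] (a CP-style composition,
    assumed here for illustration only).
    """
    return np.einsum("sr,pr,or->spo", subj, pred, obj)


# Hypothetical sizes: 5 object classes, 4 predicates, rank 2.
S, P, O, R = 5, 4, 5, 2
rng = np.random.default_rng(0)

# Random stand-ins for factors that a network would output per image.
subj = rng.normal(size=(S, R))
pred = rng.normal(size=(P, R))
obj = rng.normal(size=(O, R))

scores = compose_relationship_tensor(subj, pred, obj)

# Independent sigmoid per triple, as in a multi-label setting.
probs = 1.0 / (1.0 + np.exp(-scores))

# The composed tensor is low-rank: its mode-1 unfolding has rank <= R.
unfolded_rank = np.linalg.matrix_rank(scores.reshape(S, P * O))

# The factors need far fewer numbers than the full tensor.
n_factor_params = (S + P + O) * R
n_tensor_entries = S * P * O
```

Because every (subject, predicate, object) cell is composed from shared factors, the sketch assigns a score to all `S * P * O` triples, including combinations that never co-occur in any annotation. That is one plausible reading of how a composition layer could score unseen relationships, offered here as an assumption rather than a description of the paper's mechanism.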

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning