Pixels to Graphs by Associative Embedding

Alejandro Newell; Jia Deng

arXiv:1706.07365 · cs.CV · March 28, 2018 · 100 cites

TL;DR

This paper introduces an end-to-end convolutional neural network that generates scene graphs from images using associative embeddings, achieving state-of-the-art results on the Visual Genome dataset.

Contribution

It presents a novel single-stage method for scene graph generation from images using associative embeddings, simplifying the process and improving performance.

Findings

01 Achieved state-of-the-art performance on the Visual Genome dataset.

02 Identified and assembled scene graph elements end-to-end.

03 Demonstrated the effectiveness of associative embeddings in scene understanding.

Abstract

Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
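To make the idea of "piecing together" graph elements concrete, the following is a minimal sketch of how grouping by associative embedding could work. It is not the authors' implementation. It assumes that each detected object carries an embedding vector acting as an identity tag, and that each detected relationship predicts two vectors meant to match the tags of its subject and object. The class names, the `assemble_graph` helper, the 2-D embeddings, and the nearest-neighbor matching rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vertex:
    label: str              # object class, e.g. "person"
    embedding: List[float]  # hypothetical identity tag for this object


@dataclass
class Edge:
    label: str                  # relationship class, e.g. "riding"
    src_embedding: List[float]  # expected to lie near the subject's tag
    dst_embedding: List[float]  # expected to lie near the object's tag


def squared_distance(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_vertex(embedding: List[float], vertices: List[Vertex]) -> int:
    """Index of the vertex whose tag is closest to the given embedding."""
    return min(
        range(len(vertices)),
        key=lambda i: squared_distance(embedding, vertices[i].embedding),
    )


def assemble_graph(
    vertices: List[Vertex], edges: List[Edge]
) -> List[Tuple[str, str, str]]:
    """Resolve each edge's two embeddings to vertices (assumed matching rule)."""
    triples = []
    for edge in edges:
        src = nearest_vertex(edge.src_embedding, vertices)
        dst = nearest_vertex(edge.dst_embedding, vertices)
        triples.append((vertices[src].label, edge.label, vertices[dst].label))
    return triples


# Toy detections with made-up values; a real network would predict these.
vertices = [
    Vertex("person", [0.1, 0.9]),
    Vertex("horse", [0.8, 0.2]),
    Vertex("hat", [-0.7, -0.5]),
]
edges = [
    Edge("riding", [0.15, 0.85], [0.75, 0.25]),
    Edge("wearing", [0.05, 0.95], [-0.6, -0.4]),
]

graph = assemble_graph(vertices, edges)
print(graph)
# [('person', 'riding', 'horse'), ('person', 'wearing', 'hat')]
```

In this sketch the graph structure is never predicted explicitly: it falls out of comparing embeddings, which is what lets detection and grouping happen in a single stage.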

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition