Image-Graph-Image Translation via Auto-Encoding
Chenyang Lu, Gijs Dubbelman

TL;DR
This paper introduces a self-supervised convolutional neural network that translates images into graph representations, with objects as nodes and their relationships as edges, without external annotations. The graph is learned as the bottleneck of an auto-encoder.
Contribution
The authors present what they describe as the first fully differentiable auto-encoder for self-supervised image-to-graph translation, which reduces reliance on manual annotations.
Findings
Successfully encodes simple line drawings into graphs
Achieves F1 scores on triplet matching comparable to a fully-supervised baseline
Outlines directions for extending the approach to more complex imagery
Abstract
This work presents the first convolutional neural network that learns an image-to-graph translation task without needing external supervision. Obtaining graph representations of image content, where objects are represented as nodes and their relationships as edges, is an important task in scene understanding. Current approaches follow a fully-supervised approach thereby requiring meticulous annotations. To overcome this, we are the first to present a self-supervised approach based on a fully-differentiable auto-encoder in which the bottleneck encodes the graph's nodes and edges. This self-supervised approach can currently encode simple line drawings into graphs and obtains comparable results to a fully-supervised baseline in terms of F1 score on triplet matching. Besides these promising results, we provide several directions for future research on how our approach can be extended to…
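The abstract reports results as an F1 score on triplet matching but this page does not give the matching protocol. The following is a minimal sketch of one plausible reading, assuming each graph is flattened into (subject, relation, object) triplets and a prediction counts only on an exact match. The `Triplet` format, the `triplet_f1` function, and the toy data are hypothetical illustrations, not the authors' evaluation code.

```python
from typing import Iterable, Tuple

# Assumed triplet format: (subject node, relation, object node).
Triplet = Tuple[str, str, str]


def triplet_f1(predicted: Iterable[Triplet], reference: Iterable[Triplet]) -> float:
    """F1 between two sets of triplets, using exact matching (an assumption)."""
    pred, ref = set(predicted), set(reference)
    true_positives = len(pred & ref)
    # With no matches, precision and recall are zero (or undefined), so return 0.
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example: nodes are keypoints of a line drawing, edges mean "connected".
reference = [("a", "connected", "b"), ("b", "connected", "c"), ("c", "connected", "a")]
predicted = [("a", "connected", "b"), ("b", "connected", "c"), ("a", "connected", "d")]

score = triplet_f1(predicted, reference)  # 2 of 3 match on both sides, so about 0.667
```

Exact matching treats edges as directed, so ("c", "connected", "a") and ("a", "connected", "c") would count as different triplets; an undirected variant would need to normalize node order first.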
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
