Self-Supervised Road Layout Parsing with Graph Auto-Encoding
Chenyang Lu, Gijs Dubbelman

TL;DR
This paper introduces a self-supervised neural network that converts bird's-eye-view road layout images into human-interpretable graphs, improving topological understanding without manual annotations.
Contribution
It presents a novel image-graph-image auto-encoder trained with synthetic data, enabling stable, self-supervised learning of road topology from real-world data.
Findings
Achieves performance comparable to fully-supervised methods.
Adds a synthetic dataset of common road layout patterns to stabilize training and improve road layout understanding on the real-world Argoverse dataset.
Operates without manual annotations through self-supervised learning.
Abstract
Aiming for higher-level scene understanding, this work presents a neural network approach that takes a road-layout map in bird's-eye-view as input, and predicts a human-interpretable graph that represents the road's topological layout. Our approach elevates the understanding of road layouts from pixel level to the level of graphs. To achieve this goal, an image-graph-image auto-encoder is utilized. The network is designed to learn to regress the graph representation at its auto-encoder bottleneck. This learning is self-supervised by an image reconstruction loss, without needing any external manual annotations. We create a synthetic dataset containing common road layout patterns and use it for training of the auto-encoder in addition to the real-world Argoverse dataset. By using this additional synthetic dataset, which conceptually captures human knowledge of road layouts and makes this…
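To make the image-graph-image idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a hypothetical graph format (a list of 2D node coordinates plus index pairs for edges), a hypothetical soft rasterizer `render_graph`, and a mean-squared-error `reconstruction_loss`; the excerpt above does not specify the paper's actual decoder or the form of its loss. The encoder that would regress the graph from the input image is omitted.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to the segment from a to b."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def render_graph(nodes, edges, size, road_width=1.5):
    """Rasterize a graph into a size x size image with values in (0, 1].

    Assumption: a Gaussian falloff around each edge stands in for whatever
    decoder the paper uses; road_width is a hypothetical parameter.
    """
    image = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            for i, j in edges:
                ax, ay = nodes[i]
                bx, by = nodes[j]
                d = point_segment_distance(col, row, ax, ay, bx, by)
                value = math.exp(-((d / road_width) ** 2))
                image[row][col] = max(image[row][col], value)
    return image


def reconstruction_loss(target, rendered):
    """Mean squared error between two equally sized images."""
    count = len(target) * len(target[0])
    total = sum(
        (t - r) ** 2
        for target_row, rendered_row in zip(target, rendered)
        for t, r in zip(target_row, rendered_row)
    )
    return total / count


# Hand-written T-junction: a horizontal road with a branch going down.
nodes = [(0.0, 8.0), (15.0, 8.0), (8.0, 8.0), (8.0, 15.0)]
t_junction = [(0, 2), (2, 1), (2, 3)]
straight_road = [(0, 2), (2, 1)]  # same nodes, branch edge missing

target = render_graph(nodes, t_junction, size=16)
print("loss, correct topology:", reconstruction_loss(target, render_graph(nodes, t_junction, size=16)))
print("loss, missing branch:  ", reconstruction_loss(target, render_graph(nodes, straight_road, size=16)))
```

In a trained auto-encoder the graph would come from the encoder, and the renderer would typically need to be differentiable so that the reconstruction loss can drive learning. The graphs here are hand-written only to show how such a loss separates a correct topology from one with a missing branch.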
Taxonomy
Topics: Automated Road and Building Extraction · Infrastructure Maintenance and Monitoring
