LEGO: Learning Edge with Geometry all at Once by Watching Videos

Zhenheng Yang; Peng Wang; Yang Wang; Wei Xu; Ram Nevatia

arXiv:1803.05648 · cs.CV · March 28, 2018 · 21 cites

TL;DR

This paper introduces LEGO, an unsupervised deep learning framework that jointly estimates 3D geometry and edges from videos, significantly improving accuracy by incorporating a novel 3D-ASAP prior that enforces planar surface consistency.

Contribution

The paper presents a new unsupervised method that simultaneously learns edges, depth, and normals using a 3D-ASAP prior, enhancing geometric detail accuracy in 3D scene reconstruction.

Findings

01. Outperforms state-of-the-art on KITTI for depth and normal estimation.

02. Achieves superior edge detection accuracy on CityScapes.

03. Demonstrates consistent improvement across all evaluated tasks.

Abstract

Learning to estimate 3D geometry in a single image by watching unlabeled videos via a deep convolutional network is attracting significant attention. In this paper, we introduce a "3D as-smooth-as-possible (3D-ASAP)" prior inside the pipeline, which enables joint estimation of edges and the 3D scene, yielding results with significant improvement in accuracy for fine detailed structures. Specifically, we define the 3D-ASAP prior by requiring that any two points recovered in 3D from an image should lie on an existing planar surface if no other cues are provided. We design an unsupervised framework that Learns Edges and Geometry (depth, normal) all at Once (LEGO). The predicted edges are embedded into depth and surface normal smoothness terms, where pixels without edges in-between are constrained to satisfy the prior. In our framework, the predicted depths, normals and edges are forced to be…
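
The idea of an edge-gated smoothness term can be illustrated with a small NumPy sketch. This is not the paper's loss, which is not given in the abstract excerpt above; it is a simplified stand-in under stated assumptions. It assumes a pinhole camera, where the inverse depth of a planar surface is an affine function of pixel coordinates, so its second-order difference along a row is zero on a plane. The function name `edge_gated_planarity_penalty`, the row-wise triples of pixels, and the `1 - edge` gate are all hypothetical choices made for illustration.

```python
import numpy as np

def edge_gated_planarity_penalty(depth, edge):
    """Illustrative penalty (an assumption, not the paper's formulation).

    depth: (H, W) array of positive depths.
    edge:  (H, W) array in [0, 1], where 1 means "edge predicted here".
    Returns the mean absolute second-order difference of inverse depth
    along rows, switched off wherever an edge lies inside the pixel triple.
    """
    inv_depth = 1.0 / depth
    # Second difference over horizontal triples (left, centre, right).
    second_diff = inv_depth[:, :-2] - 2.0 * inv_depth[:, 1:-1] + inv_depth[:, 2:]
    # A triple is constrained only if none of its three pixels is an edge.
    edge_in_triple = np.maximum(np.maximum(edge[:, :-2], edge[:, 1:-1]), edge[:, 2:])
    gate = 1.0 - edge_in_triple
    return float(np.mean(gate * np.abs(second_diff)))

height, width = 4, 8
x = np.tile(np.arange(width, dtype=float), (height, 1))
no_edges = np.zeros((height, width))

# A single slanted plane: inverse depth is affine in x, so the penalty vanishes.
plane_depth = 1.0 / (0.1 + 0.01 * x)
penalty_plane = edge_gated_planarity_penalty(plane_depth, no_edges)

# Two fronto-parallel surfaces with a depth jump between columns 3 and 4.
step_depth = np.where(x < 4, 5.0, 10.0)
penalty_step_no_edge = edge_gated_planarity_penalty(step_depth, no_edges)

# Marking an edge at the jump releases the constraint across it.
edges_at_step = np.zeros((height, width))
edges_at_step[:, 4] = 1.0
penalty_step_with_edge = edge_gated_planarity_penalty(step_depth, edges_at_step)
```

In this toy setup the penalty is near zero for the single plane, positive for the depth jump when no edge is marked, and zero again once an edge is placed at the jump, which mirrors the abstract's statement that only pixels without edges in between are constrained to satisfy the prior.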

Code & Models

Repositories

zhenheny/LEGO
tf

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques