TorontoCity: Seeing the World with a Million Eyes

Shenlong Wang; Min Bai; Gellert Mattyus; Hang Chu; Wenjie Luo; Bin Yang; Justin Liang; Joel Cheverie; Sanja Fidler; Raquel Urtasun

arXiv:1612.00423 · cs.CV · December 2, 2016 · 19 cites

TL;DR

The paper introduces the TorontoCity benchmark dataset covering a large urban area with diverse data sources, and develops algorithms for aligning data with maps to facilitate various urban scene understanding tasks.

Contribution

It presents a comprehensive large-scale urban dataset with novel alignment algorithms and multiple challenging tasks for scene understanding.

Findings

01 Most tasks remain difficult for current CNNs

02 High-precision map alignment reduces labeling effort

03 Benchmark covers extensive urban environment

Abstract

In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 $\mathrm{km}^{2}$ of land, 8439 $\mathrm{km}$ of road and around 400,000 buildings. Our benchmark provides different perspectives of the world captured from airplanes, drones and cars driving around the city. Manually labeling such a large scale dataset is infeasible. Instead, we propose to utilize different sources of high-precision maps to create our ground truth. Towards this goal, we develop algorithms that allow us to align all data sources with the maps while requiring minimal human supervision. We have designed a wide variety of tasks including building height estimation (reconstruction), road centerline and curb extraction, building instance segmentation, building contour extraction (reorganization), semantic labeling and scene type classification (recognition). Our pilot study shows…

Videos

TorontoCity: Seeing the World with a Million Eyes · YouTube

Taxonomy

Topics: Automated Road and Building Extraction · Remote Sensing and LiDAR Applications · Video Surveillance and Tracking Methods