# Understanding urban landuse from the above and ground perspectives: a deep learning, multimodal solution

**Authors:** Shivangi Srivastava, John E. Vargas-Muñoz, Devis Tuia

arXiv: 1905.01752 · 2019-05-07

## TL;DR

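This paper presents a multimodal deep learning approach that combines overhead imagery from Google Maps with ground-based pictures from Google Street View to automate landuse mapping at the urban-object level. The combined model achieves considerably higher accuracy than methods that use a single image modality.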

## Contribution

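The study introduces an end-to-end trainable multimodal CNN that combines Google Maps overhead imagery with a variable number of Google Street View pictures per urban-object, using OpenStreetMap annotations as labels. The model can still predict when no ground pictures are available. It is evaluated over Île-de-France, and its generalization is tested on urban-objects from Nantes.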

## Key findings

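- The multimodal model achieves considerably higher accuracy than methods that use a single image modality.
- Generalization is tested on a set of urban-objects from Nantes, a city outside the Île-de-France study area.
- The approach could be scaled to multiple cities because it relies on data sources available for many cities worldwide.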

## Abstract

Landuse characterization is important for urban planning. It is traditionally performed with field surveys or manual photo interpretation, two practices that are time-consuming and labor-intensive. Therefore, we aim to automate landuse mapping at the urban-object level with a deep learning approach based on data from multiple sources (or modalities). We consider two image modalities: overhead imagery from Google Maps and ensembles of ground-based pictures (side-views) per urban-object from Google Street View (GSV). These modalities bring complementary visual information pertaining to the urban-objects. We propose an end-to-end trainable model, which uses OpenStreetMap annotations as labels. The model can accommodate a variable number of GSV pictures for the ground-based branch and can also function in the absence of ground pictures at prediction time. We test the effectiveness of our model over the area of Île-de-France, France, and test its generalization abilities on a set of urban-objects from the city of Nantes, France. Our proposed multimodal Convolutional Neural Network achieves considerably higher accuracies than methods that use a single image modality, making it suitable for automatic landuse map updates. Additionally, our approach could be easily scaled to multiple cities, because it is based on data sources available for many cities worldwide.
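The abstract states that the ground-based branch accepts a variable number of GSV pictures and that the model still works when none are available, but it does not say how either is achieved. The sketch below illustrates one simple way such a fusion step could behave. It assumes element-wise mean pooling over per-picture feature vectors and a zero-vector placeholder when ground pictures are missing. Both choices, along with the names `mean_pool`, `fuse`, and `ground_dim`, are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def mean_pool(features: Sequence[Vector]) -> Vector:
    """Average a non-empty set of equal-length feature vectors element-wise.

    Mean pooling is an assumption of this sketch; it yields a fixed-size
    output regardless of how many ground pictures an urban-object has.
    """
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all ground feature vectors must have the same length")
    count = len(features)
    return [sum(f[i] for f in features) / count for i in range(dim)]


def fuse(
    overhead: Vector,
    ground: Optional[Sequence[Vector]],
    ground_dim: int,
) -> Vector:
    """Concatenate overhead features with pooled ground features.

    When `ground` is None or empty, a zero vector of length `ground_dim`
    stands in for the ground branch (a placeholder chosen for illustration,
    not the mechanism described by the authors).
    """
    if ground:
        pooled = mean_pool(ground)
        if len(pooled) != ground_dim:
            raise ValueError("ground features do not match ground_dim")
    else:
        pooled = [0.0] * ground_dim
    return list(overhead) + pooled


# Hypothetical feature vectors for one urban-object.
overhead_features = [1.0, 2.0]
street_view_features = [[0.0, 4.0], [2.0, 0.0]]

with_ground = fuse(overhead_features, street_view_features, ground_dim=2)
without_ground = fuse(overhead_features, None, ground_dim=2)
```

In both calls the fused vector has the same length, so a downstream classifier would see a fixed-size input whether or not ground pictures exist for a given urban-object.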

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01752/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01752/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/1905.01752/full.md

---
Source: https://tomesphere.com/paper/1905.01752