# A fully end-to-end deep learning approach for real-time simultaneous 3D reconstruction and material recognition

**Authors:** Cheng Zhao, Li Sun, Rustam Stolkin

arXiv: 1703.04699 · 2018-07-17

## TL;DR

This paper presents an end-to-end deep learning system that reconstructs a scene in 3D while labelling materials at the pixel level in real time, using only learned features and no separate CRF post-processing step.

## Contribution

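The authors present what is, to the best of their knowledge, the first real-time, fully end-to-end deep learning approach for simultaneous 3D reconstruction and material recognition. It needs no hand-crafted features, and the CRF segmentation constraints are learned inside the network rather than applied as post-processing.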

## Key findings

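- Runs at around 10Hz on a conventional GPU, which the authors report is enough for real-time semantic reconstruction from a 30fps RGB-D camera.
- Trained to recognise 23 different materials, labelled at the pixel level, in a real-world application.
- To the best of the authors' knowledge, the first real-time end-to-end system to combine 3D reconstruction with material recognition.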
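The relationship between the roughly 10Hz throughput and the 30fps RGB-D camera mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names `frame_stride` and `per_frame_budget_ms` are hypothetical, and it assumes the slower pipeline simply processes every n-th camera frame, which this summary does not confirm.

```python
import math


def frame_stride(camera_fps: float, pipeline_hz: float) -> int:
    """Smallest n such that processing every n-th frame fits the pipeline rate.

    Assumption (not stated in the paper summary): frames that the pipeline
    cannot keep up with are skipped rather than queued.
    """
    if camera_fps <= 0 or pipeline_hz <= 0:
        raise ValueError("rates must be positive")
    return max(1, math.ceil(camera_fps / pipeline_hz))


def per_frame_budget_ms(pipeline_hz: float) -> float:
    """Time available to process one frame, in milliseconds."""
    if pipeline_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / pipeline_hz


# Figures taken from the abstract: 30fps camera, about 10Hz pipeline.
stride = frame_stride(camera_fps=30.0, pipeline_hz=10.0)
budget = per_frame_budget_ms(10.0)
print(f"process every {stride} frames, {budget:.0f} ms per processed frame")
```

Under that assumption, a 10Hz pipeline handles one in three camera frames and has about 100 ms for each one.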

## Abstract

This paper addresses the problem of simultaneous 3D reconstruction and material recognition and segmentation. Enabling robots to recognise different materials (concrete, metal etc.) in a scene is important for many tasks, e.g. robotic interventions in nuclear decommissioning. Previous work on 3D semantic reconstruction has predominantly focused on recognition of everyday domestic objects (tables, chairs etc.), whereas previous work on material recognition has largely been confined to single 2D images without any 3D reconstruction. Meanwhile, most 3D semantic reconstruction methods rely on computationally expensive post-processing, using Fully-Connected Conditional Random Fields (CRFs), to achieve consistent segmentations. In contrast, we propose a deep learning method which performs 3D reconstruction while simultaneously recognising different types of materials and labelling them at the pixel level. Unlike previous methods, we propose a fully end-to-end approach, which does not require hand-crafted features or CRF post-processing. Instead, we use only learned features, and the CRF segmentation constraints are incorporated inside the fully end-to-end learned system. We present the results of experiments, in which we trained our system to perform real-time 3D semantic reconstruction for 23 different materials in a real-world application. The run-time performance of the system can be boosted to around 10Hz, using a conventional GPU, which is enough to achieve real-time semantic reconstruction using a 30fps RGB-D camera. To the best of our knowledge, this work is the first real-time end-to-end system for simultaneous 3D reconstruction and material recognition.
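The abstract says the CRF constraints live inside the learned system but does not say how. One well-known way to do this in general is to unroll mean-field inference for a dense CRF into a fixed number of differentiable update steps. The sketch below shows that generic update in plain Python on a toy problem. It is an assumption for illustration, not the authors' architecture, and every name in it (`mean_field_step`, `crf_refine`, `bandwidth`, `weight`) is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(features, bandwidth=1.0):
    """Pairwise affinity between pixels; the diagonal stays zero."""
    n = len(features)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                kernel[i][j] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    return kernel


def mean_field_step(q, unary, kernel, weight=1.0):
    """One mean-field update with a Potts label-compatibility term."""
    n, num_labels = len(q), len(q[0])
    new_q = []
    for i in range(n):
        # Message passing: affinity-weighted sum of the other pixels' beliefs.
        msg = [sum(kernel[i][j] * q[j][l] for j in range(n))
               for l in range(num_labels)]
        total = sum(msg)
        # Potts penalty for label l: belief mass neighbours put on other labels.
        energy = [unary[i][l] + weight * (total - msg[l])
                  for l in range(num_labels)]
        new_q.append(softmax([-e for e in energy]))
    return new_q


def crf_refine(unary, features, num_iters=5, bandwidth=1.0, weight=1.0):
    """Refine per-pixel label beliefs; unary holds energies (lower is better)."""
    kernel = gaussian_kernel(features, bandwidth)
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(num_iters):
        q = mean_field_step(q, unary, kernel, weight)
    return q
```

Each step uses only sums, products and a softmax, so in a deep learning framework the same loop could be trained jointly with the network that produces the unary energies. That is the general sense in which a CRF can sit inside an end-to-end model instead of running as post-processing.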

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.04699/full.md

## Figures

37 figures with captions in the complete paper: https://tomesphere.com/paper/1703.04699/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1703.04699/full.md

---
Source: https://tomesphere.com/paper/1703.04699