# Habitat: A Platform for Embodied AI Research

**Authors:** Manolis Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra

arXiv: 1904.01201 · 2019-11-26

## TL;DR

Habitat is an open-source platform that pairs a high-performance 3D simulator (Habitat-Sim) with a modular high-level library (Habitat-API). The authors use it to run point-goal navigation experiments at a scale where learned agents outperform SLAM and where only agents with depth sensors generalize across datasets.

## Contribution

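The paper introduces Habitat, which consists of Habitat-Sim, a fast 3D simulator with configurable agents, sensors, and generic 3D dataset handling, and Habitat-API, a library for defining tasks and for configuring, training, and benchmarking embodied agents. The platform makes large-scale experiments practical, which the authors use to revisit the learning-versus-SLAM comparison and to run the first cross-dataset generalization study for point-goal navigation.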

## Key findings

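- In point-goal navigation, learning approaches outperform SLAM when scaled to an order of magnitude more experience than previous investigations, the opposite of the conclusion reached in two recent works.
- Across {train, test} x {Matterport3D, Gibson} with blind, RGB, RGBD, and D sensors, only agents with depth (D) sensors generalize across datasets.
- Habitat-Sim renders a Matterport3D scene at several thousand fps single-threaded and can reach over 10,000 fps multi-process on a single GPU.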

## Abstract

We present Habitat, a platform for research in embodied artificial intelligence (AI). Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast -- when rendering a scene from Matterport3D, it achieves several thousand frames per second (fps) running single-threaded, and can reach over 10,000 fps multi-process on a single GPU. (ii) Habitat-API: a modular high-level library for end-to-end development of embodied AI algorithms -- defining tasks (e.g., navigation, instruction following, question answering), configuring, training, and benchmarking embodied agents.   These large-scale engineering contributions enable us to answer scientific questions requiring experiments that were till now impracticable or 'merely' impractical. Specifically, in the context of point-goal navigation: (1) we revisit the comparison between learning and SLAM approaches from two recent works and find evidence for the opposite conclusion -- that learning outperforms SLAM if scaled to an order of magnitude more experience than previous investigations, and (2) we conduct the first cross-dataset generalization experiments {train, test} x {Matterport3D, Gibson} for multiple sensors {blind, RGB, RGBD, D} and find that only agents with depth (D) sensors generalize across datasets. We hope that our open-source platform and these findings will advance research in embodied AI.
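The cross-dataset experiment described in the abstract can be read as a small grid of training and evaluation configurations. The sketch below enumerates that grid in plain Python. It is only an illustration of the experimental design: the names `DATASETS`, `SENSORS`, and `experiment_grid` are hypothetical and are not part of Habitat-API, and it assumes that the grid is the full product of train dataset, test dataset, and sensor.

```python
from itertools import product

# Dataset and sensor names are taken from the abstract; the variable
# names themselves are illustrative and not part of Habitat-API.
DATASETS = ("Matterport3D", "Gibson")
SENSORS = ("blind", "RGB", "RGBD", "D")


def experiment_grid():
    """Return every (train dataset, test dataset, sensor) combination."""
    return [
        {
            "train": train,
            "test": test,
            "sensor": sensor,
            # True when the agent is evaluated on a dataset it was not trained on.
            "cross_dataset": train != test,
        }
        for train, test, sensor in product(DATASETS, DATASETS, SENSORS)
    ]


grid = experiment_grid()
cross = [cfg for cfg in grid if cfg["cross_dataset"]]
print(len(grid), len(cross))  # prints "16 8"
```

Under this reading there are 16 configurations, 8 of which evaluate on a dataset not seen in training; the abstract's generalization finding concerns those cross-dataset cells.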

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.01201/full.md

## Figures

41 figures with captions in the complete paper: https://tomesphere.com/paper/1904.01201/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1904.01201/full.md

---
Source: https://tomesphere.com/paper/1904.01201