MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments
Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun

TL;DR
MINOS is an open-source simulator for developing and benchmarking multisensory navigation models in complex indoor environments. Experiments with it show that current deep reinforcement learning approaches fail in large realistic environments and that multimodal sensing helps when learning to navigate cluttered scenes.
Contribution
MINOS provides a flexible, large-scale simulation platform for multisensory indoor navigation research, enabling benchmarking and analysis of sensor modalities in realistic settings.
Findings
Current deep reinforcement learning approaches fail in large realistic environments.
Multimodal sensing is beneficial when learning to navigate cluttered scenes.
MINOS is publicly available for research use.
Abstract
We present MINOS, a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environments. The simulator leverages large datasets of complex 3D environments and supports flexible configuration of multimodal sensor suites. We use MINOS to benchmark deep-learning-based navigation methods, to analyze the influence of environmental complexity on navigation performance, and to carry out a controlled study of multimodality in sensorimotor learning. The experiments show that current deep reinforcement learning approaches fail in large realistic environments. The experiments also indicate that multimodality is beneficial in learning to navigate cluttered scenes. MINOS is released open-source to the research community at http://minosworld.org . A video that shows MINOS can be found at https://youtu.be/c0mL9K64q84
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Tactile and Sensory Interactions · Multimodal Machine Learning Applications
