DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras
Zachary Teed, Jia Deng

TL;DR
DROID-SLAM is a deep learning-based SLAM system that iteratively updates camera pose and depth, achieving high accuracy and robustness across monocular, stereo, and RGB-D videos.
Contribution
The paper introduces recurrent iterative updates of camera pose and pixelwise depth, applied through a Dense Bundle Adjustment layer.
Findings
Significantly outperforms prior SLAM methods in accuracy.
Exhibits fewer catastrophic failures, demonstrating robustness.
Can leverage stereo and RGB-D data even when trained on monocular videos.
Abstract
We introduce DROID-SLAM, a new deep learning based SLAM system. DROID-SLAM consists of recurrent iterative updates of camera pose and pixelwise depth through a Dense Bundle Adjustment layer. DROID-SLAM is accurate, achieving large improvements over prior work, and robust, suffering from substantially fewer catastrophic failures. Despite training on monocular video, it can leverage stereo or RGB-D video to achieve improved performance at test time. The URL to our open source code is https://github.com/princeton-vl/DROID-SLAM.
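To make "iterative updates of pixelwise depth" concrete, here is a toy NumPy sketch of Gauss-Newton refinement of per-pixel inverse depth between two frames. It is an illustration under strong assumptions and is not the authors' Dense Bundle Adjustment layer: the relative pose `(R, t)` is held fixed rather than updated, the target correspondences are synthetic rather than predicted by a network, and the intrinsics `K` and every function name (`reproject`, `refine_inverse_depth`) are hypothetical.

```python
import numpy as np

def reproject(pix, inv_depth, K, R, t):
    """Map pixels (N, 2) with inverse depth (N,) from frame 1 into frame 2.

    Assumes a pinhole camera with no skew. Returns the projected pixels and
    q, a vector proportional to each 3D point expressed in frame 2.
    """
    ones = np.ones((pix.shape[0], 1))
    rays = (np.linalg.inv(K) @ np.hstack([pix, ones]).T).T  # (N, 3), z = 1
    q = rays @ R.T + inv_depth[:, None] * t                 # R * ray + d * t
    proj = np.stack([K[0, 0] * q[:, 0] / q[:, 2] + K[0, 2],
                     K[1, 1] * q[:, 1] / q[:, 2] + K[1, 2]], axis=1)
    return proj, q

def refine_inverse_depth(pix, target, inv_depth, K, R, t, iters=10):
    """Per-pixel Gauss-Newton on the 2D reprojection residual (pose fixed)."""
    d = inv_depth.copy()
    for _ in range(iters):
        proj, q = reproject(pix, d, K, R, t)
        r = target - proj                                   # residual, (N, 2)
        # Analytic derivative of the projection w.r.t. inverse depth, (N, 2).
        J = np.stack([K[0, 0] * (t[0] * q[:, 2] - q[:, 0] * t[2]) / q[:, 2] ** 2,
                      K[1, 1] * (t[1] * q[:, 2] - q[:, 1] * t[2]) / q[:, 2] ** 2],
                     axis=1)
        # Scalar normal equations per pixel: (J^T J) delta = J^T r.
        d = d + np.sum(J * r, axis=1) / np.sum(J * J, axis=1)
    return d

# Synthetic setup (all values are illustrative assumptions).
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.05])

rng = np.random.default_rng(0)
pix = rng.uniform(0, 64, size=(500, 2))
true_inv_depth = rng.uniform(0.2, 1.0, size=500)
target, _ = reproject(pix, true_inv_depth, K, R, t)

# Start every pixel from the same guess and refine iteratively.
estimate = refine_inverse_depth(pix, target, np.full(500, 0.5), K, R, t)
max_error = float(np.max(np.abs(estimate - true_inv_depth)))
```

Because each pixel has a single unknown here, the normal equations reduce to a scalar division per pixel. A solver that also updates camera pose, as the abstract describes, has to couple all pixels through the shared pose variables, which this sketch deliberately leaves out.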
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Optical measurement and interference techniques
Methods: DROID-SLAM
