Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization
Lahav Lipson, Jia Deng

TL;DR
This paper presents a novel Multi-Session SLAM system that integrates differentiable wide-baseline pose optimization with optical flow prediction, enabling accurate, robust, and end-to-end trainable multi-video camera tracking.
Contribution
It introduces a differentiable solver for wide-baseline pose estimation and integrates it into an end-to-end trainable system for multi-session SLAM.
Findings
The system connects disjoint sequences under a single global reference
It achieves high accuracy in camera pose estimation
It is robust to catastrophic failures in challenging scenarios
Abstract
We introduce a new system for Multi-Session SLAM, which tracks camera motion across multiple disjoint videos under a single global reference. Our approach couples the prediction of optical flow with solver layers to estimate camera pose. The backbone is trained end-to-end using a novel differentiable solver for wide-baseline two-view pose. The full system can connect disjoint sequences, perform visual odometry, and run global optimization. Compared to existing approaches, our design is accurate and robust to catastrophic failures. Code is available at github.com/princeton-vl/MultiSlam_DiffPose
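To make the idea of a "solver layer" for two-view pose concrete, here is a minimal sketch of the classical weighted eight-point algorithm, which maps correspondences (such as those implied by predicted optical flow) and per-match confidence weights to an essential matrix. This is a textbook stand-in, not the solver proposed in the paper; the function name `weighted_eight_point` and the synthetic scene are assumptions made for illustration. Every step is a linear-algebra operation, so the same computation written in an autodiff framework could in principle pass gradients back to the weights and matches.

```python
import numpy as np

def weighted_eight_point(x1, x2, w):
    """Estimate an essential matrix E such that x2h^T E x1h ~= 0.

    x1, x2: (N, 2) normalized image coordinates (intrinsics already removed).
    w:      (N,) non-negative confidence weights, one per correspondence.
    Hypothetical helper for illustration; not the paper's solver.
    """
    n = x1.shape[0]
    x1h = np.hstack([x1, np.ones((n, 1))])  # homogeneous points, view 1
    x2h = np.hstack([x2, np.ones((n, 1))])  # homogeneous points, view 2
    # Row k holds the outer product x2h[k] x1h[k]^T flattened row-major,
    # so that A[k] @ E.ravel() equals x2h[k]^T E x1h[k].
    A = (x2h[:, :, None] * x1h[:, None, :]).reshape(n, 9)
    A = A * w[:, None]  # down-weight unreliable matches
    # The least-squares solution is the right singular vector with the
    # smallest singular value.
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Project onto the set of essential matrices (singular values 1, 1, 0).
    u, _, vt2 = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt2

# Synthetic, noise-free two-view scene (assumed values, illustration only).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(-1.0, 1.0, 50),   # x
    rng.uniform(-1.0, 1.0, 50),   # y
    rng.uniform(4.0, 8.0, 50),    # depth in front of camera 1
])
angle = 0.5                        # rotation about the y axis, in radians
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.2, 0.1])
X2 = X @ R.T + t                   # same points in camera 2's frame

x1 = X[:, :2] / X[:, 2:3]          # perspective projection, view 1
x2 = X2[:, :2] / X2[:, 2:3]        # perspective projection, view 2
w = rng.uniform(0.5, 1.0, 50)      # stand-in for predicted confidences

E = weighted_eight_point(x1, x2, w)

# Epipolar residual |x2h^T E x1h| for every correspondence.
x1h = np.hstack([x1, np.ones((50, 1))])
x2h = np.hstack([x2, np.ones((50, 1))])
residual = np.abs(np.einsum("ni,ij,nj->n", x2h, E, x1h))
```

On noise-free data the epipolar residuals should be at the level of floating-point round-off. With real matches, the weights determine how much each correspondence influences the estimate, which is where a learned confidence would enter.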
Taxonomy
Topics: Robotic Path Planning Algorithms · Modular Robots and Swarm Intelligence · Robotics and Sensor-Based Localization
