Distilled Visual and Robot Kinematics Embeddings for Metric Depth Estimation in Monocular Scene Reconstruction

Ruofeng Wei; Bin Li; Hangjie Mo; Fangxun Zhong; Yonghao Long; Qi Dou; Yun-Hui Liu; Dong Sun

arXiv:2211.14738 · cs.RO · November 29, 2022

Open Access

TL;DR

This paper introduces a novel deep learning framework that combines robot kinematics and monocular endoscopy to accurately estimate metric depth and reconstruct 3D surgical scenes, overcoming limitations of traditional stereo methods.

Contribution

The authors develop a unified approach integrating robot kinematics, monocular images, and deep learning for precise metric depth estimation and 3D reconstruction in robotic surgery.

Findings

01. Achieved depth estimation performance comparable to stereo methods on public datasets.

02. Developed a Depth-driven Sliding Optimization (DDSO) algorithm for scale extraction.

03. Successfully reconstructed 3D surgical scenes from monocular endoscopic videos.

Abstract

Estimating precise metric depth and scene reconstruction from monocular endoscopy is a fundamental task for surgical navigation in robotic surgery. However, traditional stereo matching adopts binocular images to perceive the depth information, which is difficult to transfer to the soft robotics-based surgical systems due to the use of monocular endoscopy. In this paper, we present a novel framework that combines robot kinematics and monocular endoscope images with deep unsupervised learning into a single network for metric depth estimation and then achieve 3D reconstruction of complex anatomy. Specifically, we first obtain the relative depth maps of surgical scenes by leveraging a brightness-aware monocular depth estimation method. Then, the corresponding endoscope poses are computed based on non-linear optimization of geometric and photometric reprojection residuals. Afterwards, we…

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Soft Robotics and Applications