EndoSLAM Dataset and An Unsupervised Monocular Visual Odometry and Depth Estimation Approach for Endoscopic Videos: Endo-SfMLearner

Kutsev Bengisu Ozyoruk; Guliz Irem Gokceler; Gulfize Coskun; Kagan Incetan; Yasin Almalioglu; Faisal Mahmood; Eva Curto; Luis Perdigoto; Marina Oliveira; Hasan Sahin; Helder Araujo; Henrique Alexandrino; Nicholas J. Durr; Hunter B. Gilbert; and Mehmet Turan

arXiv:2006.16670 · cs.CV · October 2, 2020 · 5 cites

TL;DR

This paper introduces a comprehensive endoscopic SLAM dataset with diverse data types and ground truth, and proposes Endo-SfMLearner, an unsupervised deep learning method for monocular depth and pose estimation in endoscopic videos.

Contribution

The paper provides a new extensive dataset for endoscopic SLAM with ground truth and synthetic data, and develops Endo-SfMLearner, a novel unsupervised approach utilizing residual networks and attention for depth and pose estimation.

Findings

01. Endo-SfMLearner outperforms existing methods on the dataset.

02. The dataset enables effective benchmarking of endoscopic SLAM algorithms.

03. Synthetic data facilitates transfer learning to real endoscopic videos.

Abstract

Deep learning techniques hold promise for developing dense topography reconstruction and pose estimation methods for endoscopic videos. However, currently available datasets do not support effective quantitative benchmarking. In this paper, we introduce a comprehensive endoscopic SLAM dataset consisting of 3D point cloud data for six porcine organs, capsule and standard endoscopy recordings, as well as synthetically generated data. A Panda robotic arm, two commercially available capsule endoscopes, two conventional endoscopes with different camera properties, and two high-precision 3D scanners were employed to collect data from 8 ex-vivo porcine gastrointestinal (GI)-tract organs. In total, 35 sub-datasets are provided with 6D pose ground truth for the ex-vivo part: 18 sub-datasets for colon, 12 sub-datasets for stomach, and 5 sub-datasets for small intestine, while four of these contain…

Code & Models

Repositories

CapsuleEndoscope/EndoSLAM
PyTorch · Official

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging

Methods: Convolution · Sigmoid Activation · Average Pooling · Max Pooling