Learning-Based Depth and Pose Estimation for Monocular Endoscope with Loss Generalization
Aji Resindra Widya, Yusuke Monno, Masatoshi Okutomi, Sho Suzuki, Takuji Gotoda, Kenji Miki

TL;DR
This paper introduces a deep learning approach for estimating depth and pose in monocular gastroendoscopy, utilizing real data and a novel loss function to improve navigation and localization within the stomach.
Contribution
It presents a new supervised training method with a generalized photometric loss that enhances depth and pose estimation accuracy for endoscopic navigation.
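For readers unfamiliar with the term, a photometric loss compares a target frame with a neighbouring frame warped into the target view using the predicted depth and relative pose. The sketch below is a generic mean-absolute-difference version of that idea and is not the paper's generalized loss, whose form this summary does not give. The function name, the pinhole parameters `fx, fy, cx, cy`, the nearest-neighbour sampling, and the pose convention (`R`, `t` map target-frame points into the source frame) are all assumptions made for illustration.

```python
def photometric_loss(target, source, depth, fx, fy, cx, cy, R, t):
    """Mean absolute intensity difference between target pixels and the
    source pixels they reproject to (hypothetical helper, illustration only).

    target, source, depth: 2D lists of floats with the same shape.
    R: 3x3 rotation (list of lists), t: 3-vector; both assumed to map
    points from the target camera frame into the source camera frame.
    """
    h, w = len(target), len(target[0])
    total, count = 0.0, 0
    for v in range(h):
        for u in range(w):
            d = depth[v][u]
            # Back-project the target pixel to a 3D point (pinhole model).
            p = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Move the point into the source camera frame.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 1e-6:
                continue  # point is behind the source camera
            # Project into the source image, nearest-neighbour sampling.
            us = round(fx * q[0] / q[2] + cx)
            vs = round(fy * q[1] / q[2] + cy)
            if 0 <= us < w and 0 <= vs < h:
                total += abs(target[v][u] - source[vs][us])
                count += 1
    # Average over pixels that landed inside the source image.
    return total / count if count else 0.0
```

With an identity pose and identical frames the loss is zero; errors in the predicted depth or pose misalign the warp and raise it, which is what makes it usable as a training signal. Practical implementations typically use differentiable bilinear sampling rather than rounding.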
Findings
The generalized loss outperforms existing supervision losses.
Real data training improves generalization to actual endoscopy images.
The approach aids in better endoscope navigation and lesion localization.
Abstract
Gastroendoscopy has been a clinical standard for diagnosing and treating conditions that affect a part of a patient's digestive system, such as the stomach. Despite the fact that gastroendoscopy has a lot of advantages for patients, there exist some challenges for practitioners, such as the lack of 3D perception, including the depth and the endoscope pose information. Such challenges make navigating the endoscope and localizing any found lesion in a digestive tract difficult. To tackle these problems, deep learning-based approaches have been proposed to provide monocular gastroendoscopy with additional yet important depth and pose information. In this paper, we propose a novel supervised approach to train depth and pose estimation networks using consecutive endoscopy images to assist the endoscope navigation in the stomach. We firstly generate real depth and pose training data using our…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Surgical Simulation and Training · Advanced Image and Video Retrieval Techniques
