Extraction of Key-frames of Endoscopic Videos by using Depth Information
Pradipta Sasmal, Avinash Paul, M.K. Bhuyan, and Yuji Iwahori

TL;DR
This paper introduces a deep learning-based monocular depth estimation method for selecting the most informative key frames in endoscopic videos, aiding clinical diagnosis and potential 3D reconstruction of polyps.
Contribution
It proposes a novel key-frame selection strategy using depth information, image moments, edges, and key-points, addressing the lack of ground truth depth maps through transfer learning.
Findings
Effective selection of key frames using depth cues.
Potential for improved 3D reconstruction of polyps.
Enhanced localization of polyps with depth maps.
Abstract
A deep learning-based monocular depth estimation (MDE) technique is proposed for selecting the most informative frames (key frames) of an endoscopic video. In most cases, ground truth depth maps of polyps are not readily available, which is why a transfer learning approach is adopted in our method. Endoscopic modalities generally capture thousands of frames. In this scenario, it is important to discard low-quality and clinically irrelevant frames of an endoscopic video while retaining the most informative frames for clinical diagnosis. To this end, a key-frame selection strategy is proposed that utilizes the depth information of polyps. In our method, image moments, edge magnitude, and key-points are considered for adaptively selecting the key frames. One important application of our proposed method could be the 3D reconstruction of polyps with the help of…
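The abstract names the cues but not how they are combined, so the following is only a minimal sketch of depth-based key-frame scoring, not the authors' method. It assumes each frame already has an estimated depth map (a list of rows of floats), scores frames by two of the named cues (edge magnitude and second-order central image moments), and uses a mean-plus-k-standard-deviations threshold as a stand-in for "adaptive" selection. The function names, the weights, and the threshold rule are all hypothetical, and the key-point cue is left out to keep the example dependency-free.

```python
from math import hypot
from statistics import mean, pstdev


def edge_magnitude(depth):
    """Mean forward-difference gradient magnitude of a 2-D depth map."""
    h, w = len(depth), len(depth[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = depth[y][x + 1] - depth[y][x]
            gy = depth[y + 1][x] - depth[y][x]
            total += hypot(gx, gy)
    return total / ((h - 1) * (w - 1))


def moment_spread(depth):
    """(mu20 + mu02) / m00: second-order central image moments, mass-normalised."""
    m00 = sum(sum(row) for row in depth)
    if m00 == 0:
        return 0.0
    cx = sum(x * v for row in depth for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(depth) for v in row) / m00
    mu = sum(((x - cx) ** 2 + (y - cy) ** 2) * v
             for y, row in enumerate(depth) for x, v in enumerate(row))
    return mu / m00


def zscores(values):
    """Standardise a list; all zeros when the values do not vary."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]


def select_key_frames(depth_maps, weights=(1.0, 1.0), k=0.0):
    """Indices of frames whose combined score exceeds mean + k * std.

    Assumption: the weighted sum and the threshold rule are illustrative
    choices, not taken from the paper.
    """
    edge_z = zscores([edge_magnitude(d) for d in depth_maps])
    moment_z = zscores([moment_spread(d) for d in depth_maps])
    scores = [weights[0] * e + weights[1] * m
              for e, m in zip(edge_z, moment_z)]
    threshold = mean(scores) + k * pstdev(scores)
    return [i for i, s in enumerate(scores) if s > threshold]
```

The weights, including their signs, are tuning choices in this sketch. A frame with a localized depth bump has stronger depth edges but a smaller moment spread than a flat frame, so the two cues can pull in opposite directions and may need opposite signs or separate thresholds.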
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques
