Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy

Pedro Esteban Chavarrias Solano; Andrew Bulpitt; Venkataraman Subramanian; Sharib Ali

arXiv:2311.18664·cs.CV·December 1, 2023·1 cites

TL;DR

This paper introduces a multi-task learning framework with cross-task consistency for more accurate depth estimation in colonoscopy videos, addressing challenges posed by complex organ topology and monocular imaging.

Contribution

The work presents a novel multi-task learning model with a shared encoder, an attention-enhanced depth decoder, and a surface normal decoder, trained with a cross-task consistency loss to improve depth estimation in colonoscopy.
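To make the idea of cross-task consistency concrete, here is a minimal sketch of one common formulation: derive surface normals from the predicted depth map and penalise their angular disagreement with the normals predicted by the second decoder. This summary does not give the paper's exact loss, so everything below is an assumption for illustration: the function names `normals_from_depth` and `consistency_loss` are hypothetical, the normals come from simple central differences in pixel space (camera intrinsics are ignored), and the penalty is one minus cosine similarity.

```python
import math


def normals_from_depth(depth):
    """Approximate unit normals from a depth map given as a list of rows.

    Assumption: central differences in pixel units, no camera intrinsics.
    Border pixels keep a default normal of (0, 0, 1).
    """
    h, w = len(depth), len(depth[0])
    normals = [[(0.0, 0.0, 1.0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dzdx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            dzdy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            n = (-dzdx, -dzdy, 1.0)
            length = math.sqrt(sum(c * c for c in n))
            normals[y][x] = tuple(c / length for c in n)
    return normals


def consistency_loss(depth, pred_normals):
    """Mean (1 - cosine similarity) between depth-derived and predicted
    normals, evaluated on interior pixels only (hypothetical loss)."""
    derived = normals_from_depth(depth)
    h, w = len(depth), len(depth[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a, b = derived[y][x], pred_normals[y][x]
            b_len = math.sqrt(sum(c * c for c in b))
            cos = sum(p * q for p, q in zip(a, b)) / b_len
            total += 1.0 - cos
            count += 1
    return total / count


# Usage: a tilted plane whose depth grows by 1 per pixel along x.
plane = [[float(x) for x in range(4)] for _ in range(4)]
agreeing = normals_from_depth(plane)
print(consistency_loss(plane, agreeing))  # close to 0: the two tasks agree
```

In a real training setup this term would be added to the per-task depth and normal losses and computed on tensors with an autodiff framework; the pure-Python loops here only show the arithmetic.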

Findings

01

14.17% reduction in relative error

02

10.4% improvement in $\delta_1$ accuracy

03

First benchmark of state-of-the-art methods on C3VD dataset
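The first two findings are reported in terms of relative error and $\delta_1$ accuracy. The sketch below shows how these metrics are usually computed in monocular depth estimation, assuming the standard definitions (mean of |pred - gt| / gt, and the fraction of pixels with max(pred/gt, gt/pred) < 1.25). This summary does not spell out the paper's exact evaluation protocol, so the function names and the valid-pixel rule (ground truth > 0) are illustrative assumptions.

```python
def abs_rel_error(pred, gt):
    """Mean absolute relative error over pixels with valid ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels whose ratio max(pred/gt, gt/pred) is below
    the threshold; threshold=1.25 is the usual delta_1 setting."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)


# Usage on three flattened depth values (arbitrary units).
pred = [1.0, 2.2, 4.0]
gt = [1.0, 2.0, 2.0]
print(abs_rel_error(pred, gt))   # (0 + 0.1 + 1.0) / 3
print(delta_accuracy(pred, gt))  # two of three pixels fall within 1.25x
```

Lower is better for the relative error, while higher is better for $\delta_1$.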

Abstract

Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous texture, and image modality), estimating distance from the camera (aka depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While methods in computer vision for depth estimation have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well…

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques