Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy
Pedro Esteban Chavarrias Solano, Andrew Bulpitt, Venkataraman Subramanian, Sharib Ali

TL;DR
This paper introduces a multi-task learning framework with cross-task consistency for more accurate depth estimation in colonoscopy videos, addressing challenges posed by complex organ topology and monocular imaging.
Contribution
The work presents a novel multi-task learning model with a shared encoder, an attention-enhanced depth decoder, and a surface normal decoder. A cross-task consistency loss couples the two decoders to improve depth estimation in colonoscopy.
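One common way to express cross-task consistency between depth and surface normals is to derive normals from the predicted depth map and penalise disagreement with the directly predicted normals. The sketch below illustrates that idea only; it is not the paper's formulation, which this summary does not give. The function names `normals_from_depth` and `consistency_loss`, the finite-difference normals in pixel space (ignoring camera intrinsics), and the cosine-based penalty are all assumptions made for illustration.

```python
import math

def normals_from_depth(depth):
    """Approximate unit normals from a depth map (list of rows).

    Assumption: pixel-space finite differences, no camera intrinsics.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences inside the image, one-sided at the border.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            n = (-dzdx, -dzdy, 1.0)
            length = math.sqrt(sum(c * c for c in n))
            row.append(tuple(c / length for c in n))
        normals.append(row)
    return normals

def consistency_loss(depth_pred, normals_pred):
    """Mean (1 - cosine similarity) between the predicted normals and the
    normals derived from the predicted depth.

    Assumption: normals_pred holds unit-length (nx, ny, nz) tuples.
    """
    derived = normals_from_depth(depth_pred)
    total, count = 0.0, 0
    for derived_row, pred_row in zip(derived, normals_pred):
        for d, p in zip(derived_row, pred_row):
            total += 1.0 - sum(a * b for a, b in zip(d, p))
            count += 1
    return total / count
```

In a training setup this term would typically be added, with a weight, to the per-task depth and normal losses, so that errors in one decoder are penalised when they contradict the other.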
Findings
14.17% reduction in relative error
10.4% improvement in $\delta_1$ accuracy
First benchmark of state-of-the-art methods on C3VD dataset
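For readers unfamiliar with the two metrics named above, the following minimal sketch shows how they are commonly computed in depth-estimation work. It assumes that "relative error" means mean absolute relative error and that $\delta_1$ uses the customary threshold of 1.25; this summary confirms neither. The function names are hypothetical.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)

def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels where max(pred/gt, gt/pred) < threshold.

    Assumption: predictions are strictly positive depths.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)

# Flattened toy depth values: two exact predictions, one off by a factor of 2.
pred = [1.0, 2.0, 4.0]
gt = [1.0, 2.0, 2.0]
print(abs_rel(pred, gt))         # one third of the pixels contribute error 1.0
print(delta_accuracy(pred, gt))  # two of three pixels fall within the threshold
```

Lower is better for relative error, and higher is better for $\delta_1$.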
Abstract
Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions, such as lighting, large homogeneous texture, and image modality, estimating distance from the camera (that is, depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While methods in computer vision for depth estimation have been proposed and advanced on natural scene datasets, the efficacy of these techniques has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well…
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image and Video Retrieval Techniques
