Monocular Depth Guided Occlusion-Aware Disparity Refinement via Semi-supervised Learning in Laparoscopic Images
Ziteng Liu, Dongdong He, Chenghong Zhang, Wenpeng Gao, Yili Fu

TL;DR
This paper introduces DGORNet, a semi-supervised disparity refinement network for stereo laparoscopic images. It combines monocular depth, explicit spatial context, and temporal information from video frames to improve accuracy, particularly in occluded and texture-less regions.
Contribution
The study presents a novel disparity refinement network incorporating monocular depth guidance, position embedding, and optical flow-based loss, addressing occlusion and data scarcity in surgical stereo images.
Findings
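Outperforms state-of-the-art methods on the SCARED dataset in End-Point Error (EPE) and Root Mean Squared Error (RMSE)
Gains are most pronounced in occluded and texture-less regions
Ablation confirms the importance of the Position Embedding (PE) module and the Optical Flow Difference Loss (OFDLoss)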
Abstract
Occlusion and the scarcity of labeled surgical data are significant challenges in disparity estimation for stereo laparoscopic images. To address these issues, this study proposes a Depth Guided Occlusion-Aware Disparity Refinement Network (DGORNet), which refines disparity maps by leveraging monocular depth information unaffected by occlusion. A Position Embedding (PE) module is introduced to provide explicit spatial context, enhancing the network's ability to localize and refine features. Furthermore, we introduce an Optical Flow Difference Loss (OFDLoss) for unlabeled data, leveraging temporal continuity across video frames to improve robustness in dynamic surgical scenes. Experiments on the SCARED dataset demonstrate that DGORNet outperforms state-of-the-art methods in terms of End-Point Error (EPE) and Root Mean Squared Error (RMSE), particularly in occlusion and texture-less…
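The two metrics named in the abstract have standard definitions for disparity maps: EPE is the mean absolute disparity error over valid pixels, and RMSE is the square root of the mean squared error over the same pixels. The sketch below illustrates those definitions only; it is not the authors' evaluation code. The function name `disparity_errors`, the nested-list input format, and the default validity rule (ground truth finite and positive) are assumptions made for this example.

```python
import math


def disparity_errors(pred, gt, valid=None):
    """Return (EPE, RMSE) over valid pixels of two same-sized 2D disparity maps.

    pred, gt: nested lists (rows of floats) with identical shape.
    valid:    optional nested list of booleans; if omitted, a pixel counts as
              valid when its ground truth is finite and positive (an assumed
              convention, not one taken from the paper).
    """
    abs_sum = 0.0
    sq_sum = 0.0
    count = 0
    for r, (pred_row, gt_row) in enumerate(zip(pred, gt)):
        for c, (p, g) in enumerate(zip(pred_row, gt_row)):
            if valid is not None:
                keep = bool(valid[r][c])
            else:
                keep = math.isfinite(g) and g > 0
            if not keep:
                continue
            diff = p - g
            abs_sum += abs(diff)      # accumulates the EPE numerator
            sq_sum += diff * diff     # accumulates the RMSE numerator
            count += 1
    if count == 0:
        raise ValueError("no valid pixels to evaluate")
    epe = abs_sum / count
    rmse = math.sqrt(sq_sum / count)
    return epe, rmse


# Toy 2x2 maps; the pixel with ground truth 0 is treated as invalid.
pred_map = [[1.0, 2.0], [3.0, 4.0]]
gt_map = [[1.0, 4.0], [0.0, 8.0]]
epe, rmse = disparity_errors(pred_map, gt_map)
```

To report errors separately for occluded or texture-less regions, a region mask could be passed as `valid`, assuming such masks are available for the data.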
