GeomPrompt: Geometric Prompt Learning for RGB-D Semantic Segmentation Under Missing and Degraded Depth
Krishna Jaganathan, Patricio Vela

TL;DR
GeomPrompt introduces a lightweight, task-driven geometric prompt module that enhances RGB-D semantic segmentation robustness under missing or degraded depth, without requiring depth supervision.
Contribution
The paper proposes GeomPrompt and GeomPrompt-Recovery modules that synthesize geometric prompts from RGB alone and compensate for degraded depth, improving segmentation accuracy and efficiency.
Findings
On SUN RGB-D, GeomPrompt improves over RGB-only inference by +6.1 mIoU on DFormer and +3.0 mIoU on GeminiFusion.
GeomPrompt-Recovery enhances robustness under severe depth corruption.
Both modules are trained with downstream segmentation supervision only, without any depth supervision.
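For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class overlap between predicted and ground-truth labels. The following is a minimal pure-Python sketch of the standard definition, not the paper's evaluation code. The function name `mean_iou` and the flat label lists are illustrative assumptions; benchmark implementations typically accumulate a confusion matrix over the whole dataset and may ignore a void label.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in pred or target.

    pred, target: flat sequences of integer class labels, same length.
    Illustrative helper only; not taken from the paper.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
# Per-class IoUs are 1/3, 2/3 and 1/2, so the mean is approximately 0.5.
print(score)
```

mIoU is usually reported as a percentage, so a gain such as +6.1 mIoU is most naturally read as 6.1 percentage points on that scale.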
Abstract
Multimodal perception systems for robotics and embodied AI often assume reliable RGB-D sensing, but in practice, depth is frequently missing, noisy, or corrupted. We thus present GeomPrompt, a lightweight cross-modal adaptation module that synthesizes a task-driven geometric prompt from RGB alone for the fourth channel of a frozen RGB-D semantic segmentation model, without depth supervision. We further introduce GeomPrompt-Recovery, an adaptation module that compensates for degraded depth by predicting the fourth channel correction relevant for the frozen segmenter. Both modules are trained solely with downstream segmentation supervision, enabling recovery of the geometric prior useful for segmentation, rather than estimating depth signals. On SUN RGB-D, GeomPrompt improves over RGB-only inference by +6.1 mIoU on DFormer and +3.0 mIoU on GeminiFusion, while remaining competitive with…
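The training setup described in the abstract can be sketched in PyTorch as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the class names `GeomPromptSketch`, `RecoverySketch`, and `ToySegmenter`, the layer sizes, the channel-concatenation interface, and the residual form of the correction are all hypothetical. Real backbones such as DFormer or GeminiFusion have their own input interfaces, and a pretrained checkpoint would take the place of the randomly initialized stand-in segmenter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeomPromptSketch(nn.Module):
    """Hypothetical prompt generator: RGB (B,3,H,W) -> prompt (B,1,H,W)."""

    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, 1, kernel_size=3, padding=1),
        )

    def forward(self, rgb):
        return self.net(rgb)


class RecoverySketch(nn.Module):
    """Hypothetical recovery module: predicts a correction for degraded depth.

    Assumption: the correction is added residually to the degraded channel.
    """

    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, 1, kernel_size=3, padding=1),
        )

    def forward(self, rgb, degraded_depth):
        correction = self.net(torch.cat([rgb, degraded_depth], dim=1))
        return degraded_depth + correction


class ToySegmenter(nn.Module):
    """Stand-in for a pretrained RGB-D segmenter with a 4-channel input."""

    def __init__(self, num_classes):
        super().__init__()
        self.head = nn.Conv2d(4, num_classes, kernel_size=1)

    def forward(self, rgbd):
        return self.head(rgbd)


torch.manual_seed(0)
num_classes = 37  # placeholder class count for the toy example

# Freeze the segmenter: only the adapter receives parameter updates.
segmenter = ToySegmenter(num_classes)
for p in segmenter.parameters():
    p.requires_grad_(False)
segmenter.eval()

prompt = GeomPromptSketch()
optimizer = torch.optim.Adam(prompt.parameters(), lr=1e-3)

# Random tensors stand in for an RGB batch and its segmentation labels.
rgb = torch.rand(2, 3, 32, 32)
labels = torch.randint(0, num_classes, (2, 32, 32))

# One training step: the loss is the segmentation loss only, so the
# prompt is never compared against ground-truth depth.
geom = prompt(rgb)
logits = segmenter(torch.cat([rgb, geom], dim=1))
loss = F.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The recovery variant feeds a corrected fourth channel to the same segmenter.
recovery = RecoverySketch()
degraded = torch.zeros(2, 1, 32, 32)  # e.g. fully missing depth
recovered_logits = segmenter(torch.cat([rgb, recovery(rgb, degraded)], dim=1))
```

Because gradients reach the adapter only through the frozen segmenter, the learned fourth channel is pushed toward whatever signal helps segmentation, which need not resemble metric depth.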
