Towards High-Precision Depth Sensing via Monocular-Aided iToF and RGB Integration
Yansong Du, Yutong Deng, Yuting Zhou, Feiyu Jiao, Jian Song, Xun Guan

TL;DR
This paper introduces a fusion framework combining monocular priors, iToF, and RGB data to improve depth sensing accuracy, resolution, and field-of-view in complex scenes.
Contribution
The paper proposes an iToF-RGB fusion method that combines geometric calibration with a dual-encoder network to improve depth accuracy and resolution.
Findings
Outperforms state-of-the-art methods in accuracy and visual quality
Achieves better edge sharpness and depth consistency
Effectively expands field-of-view in complex scenes
Abstract
This paper presents a novel iToF-RGB fusion framework designed to address the inherent limitations of indirect Time-of-Flight (iToF) depth sensing, such as low spatial resolution, limited field-of-view (FoV), and structural distortion in complex scenes. The proposed method first reprojects the narrow-FoV iToF depth map onto the wide-FoV RGB coordinate system through a precise geometric calibration and alignment module, ensuring pixel-level correspondence between modalities. A dual-encoder fusion network is then employed to jointly extract complementary features from the reprojected iToF depth and RGB image, guided by monocular depth priors to recover fine-grained structural details and perform depth super-resolution. By integrating cross-modal structural cues and depth consistency constraints, our approach achieves enhanced depth accuracy, improved edge sharpness, and seamless FoV…
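The reprojection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' calibration and alignment module. It assumes an ideal pinhole model without lens distortion, known intrinsics for both cameras, and a known rigid transform from the iToF frame to the RGB frame. The function name `reproject_depth` and all parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Intrinsics = Tuple[float, float, float, float]  # (fx, fy, cx, cy), assumed pinhole


def reproject_depth(
    depth: Sequence[Sequence[float]],
    k_tof: Intrinsics,
    k_rgb: Intrinsics,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    rgb_size: Tuple[int, int],
) -> List[List[float]]:
    """Warp an iToF depth map onto the RGB pixel grid (0.0 marks 'no depth')."""
    fx_t, fy_t, cx_t, cy_t = k_tof
    fx_r, fy_r, cx_r, cy_r = k_rgb
    width, height = rgb_size
    out = [[0.0] * width for _ in range(height)]

    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # skip invalid iToF measurements
            # Back-project the iToF pixel to a 3D point in the iToF frame.
            point = ((u - cx_t) / fx_t * z, (v - cy_t) / fy_t * z, z)
            # Apply the rigid transform into the RGB camera frame.
            xr, yr, zr = [
                sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
                for i in range(3)
            ]
            if zr <= 0.0:
                continue  # point lies behind the RGB camera
            # Project onto the RGB image plane and snap to the nearest pixel.
            ur = round(fx_r * xr / zr + cx_r)
            vr = round(fy_r * yr / zr + cy_r)
            if 0 <= ur < width and 0 <= vr < height:
                # Keep the nearest surface when several points hit one pixel.
                if out[vr][ur] == 0.0 or zr < out[vr][ur]:
                    out[vr][ur] = zr
    return out


# Toy example: a 2x2 iToF map lands in the centre of a wider 4x4 RGB grid.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
warped = reproject_depth(
    depth=[[2.0, 2.0], [2.0, 2.0]],
    k_tof=(1.0, 1.0, 0.5, 0.5),
    k_rgb=(1.0, 1.0, 1.5, 1.5),
    rotation=identity,
    translation=[0.0, 0.0, 0.0],
    rgb_size=(4, 4),
)
```

In this toy setup the RGB pixels outside the iToF footprint remain empty. That border is the region a fusion network would have to fill from RGB and monocular cues to extend the field of view.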
