Enhancing Monocular Height Estimation via Sparse LiDAR-Guided Correction
Jian Song, Hongruixuan Chen, Naoto Yokoya

TL;DR
This paper presents a fully automated pipeline that combines sparse global LiDAR measurements from ICESat-2 with deep learning predictions to improve monocular height estimation accuracy across diverse environments, enabling low-cost, scalable 3D mapping.
Contribution
It introduces a fully automated correction method that integrates sparse LiDAR measurements with deep learning predictions, establishes the first benchmark for this task, and reports substantial accuracy improvements.
Findings
Reduces MHE MAE by 30.9%
Improves F1HE score by 44.2% for MHE
Reduces MDE MAE by 24.1%
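The percentages above are relative changes against the uncorrected model. As a minimal sketch of that arithmetic, the snippet below computes a relative change from a baseline and a corrected metric. The function name `relative_change` and the MAE values are hypothetical, chosen only so the result lands near the reported 30.9% reduction; they are not figures from the paper.

```python
def relative_change(baseline: float, corrected: float) -> float:
    """Return the change from baseline to corrected, in percent of baseline.

    Negative values mean the metric went down (good for an error such as MAE);
    positive values mean it went up (good for a score such as F1).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (corrected - baseline) / baseline * 100.0


# Hypothetical numbers for illustration only (not taken from the paper):
# an MAE that drops from 10.00 m to 6.91 m is a reduction of about 30.9%.
mae_change = relative_change(10.0, 6.91)
print(f"MAE change: {mae_change:.1f}%")
```

A reduction in MAE and an improvement in an F1-type score point in opposite numeric directions, so the sign of the result has to be read against whether the metric is an error or a score.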
Abstract
Monocular height estimation (MHE) from very-high-resolution (VHR) optical imagery remains challenging due to limited structural cues and the high cost and geographic constraints of conventional elevation data such as airborne LiDAR and multi-view stereo. Although recent MHE and monocular depth estimation (MDE) models show strong performance, their robustness under varied illumination and scene conditions is still limited. We introduce a fully automated correction pipeline that integrates sparse, imperfect global LiDAR measurements from ICESat-2 with deep learning predictions to enhance accuracy and stability. The workflow relies entirely on publicly available models and data and requires only a single georeferenced optical image to produce corrected height maps, enabling low-cost and globally scalable deployment. We also establish the first benchmark for this task, evaluating two random…
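The general idea of correcting a dense prediction with sparse reference heights can be sketched as follows. This is not the paper's method, which the truncated abstract does not describe in detail; it swaps in a simple global least-squares scale-and-offset calibration as a stand-in. The function `correct_height_map`, the synthetic data, and the assumption that LiDAR footprints map to exact pixel indices are all hypothetical.

```python
import numpy as np


def correct_height_map(pred, rows, cols, lidar_h):
    """Calibrate a predicted height map against sparse reference heights.

    Assumption (not from the paper): the prediction error is a global
    scale and offset, so we fit lidar_h ~= a * pred[rows, cols] + b by
    least squares and apply the same a, b to every pixel.
    """
    samples = pred[rows, cols]
    design = np.stack([samples, np.ones_like(samples)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, lidar_h, rcond=None)
    return a * pred + b, a, b


# Synthetic example: a "true" height map and a biased prediction of it.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 30.0, size=(64, 64))
pred = 0.5 * truth + 2.0  # hypothetical systematic bias

# Fifty sparse reference points standing in for LiDAR footprints.
rows = rng.integers(0, 64, size=50)
cols = rng.integers(0, 64, size=50)
lidar_h = truth[rows, cols]

corrected, a, b = correct_height_map(pred, rows, cols, lidar_h)
mae_before = np.abs(pred - truth).mean()
mae_after = np.abs(corrected - truth).mean()
print(f"MAE before: {mae_before:.3f}, after: {mae_after:.3f}")
```

Real sparse LiDAR is noisy and unevenly distributed, which the abstract itself notes by calling the measurements imperfect, so a practical pipeline would need outlier handling and likely a spatially varying correction rather than a single global fit.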
