Enhancing Monocular Height Estimation via Weak Supervision from Imperfect Labels
Sining Chen, Yilei Shi, Xiao Xiang Zhu

TL;DR
This paper proposes a weak supervision approach using imperfect labels from different regions to improve monocular height estimation, enhancing cross-domain accuracy with a novel ensemble pipeline and loss functions.
Contribution
It introduces a versatile ensemble-based pipeline with specialized loss functions to leverage noisy labels for better height estimation across diverse regions.
Findings
Reduces RMSE by up to 22.94% on the DFC23 dataset.
Improves cross-domain performance with more consistent results.
Ablation studies validate each component's effectiveness.
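As a reading aid for the RMSE figure above, the following minimal sketch shows how a relative RMSE reduction is typically computed. It assumes the reported percentage is a relative reduction against a baseline model, which this summary does not state explicitly. The function names and the toy height values are hypothetical and are not taken from the paper or from DFC23.

```python
import math


def rmse(pred, target):
    """Root-mean-square error between paired height values (e.g. in metres)."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(pred, target)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_rmse, improved_rmse):
    """Percentage by which improved_rmse is lower than baseline_rmse."""
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return 100.0 * (baseline_rmse - improved_rmse) / baseline_rmse


# Toy values for illustration only (assumption, not real data).
target = [0.0, 3.0, 10.0, 25.0]
baseline_pred = [2.0, 5.0, 6.0, 21.0]   # errors: +2, +2, -4, -4
improved_pred = [1.0, 4.0, 8.0, 23.0]   # errors: +1, +1, -2, -2

baseline_rmse = rmse(baseline_pred, target)
improved_rmse = rmse(improved_pred, target)
reduction = relative_reduction(baseline_rmse, improved_rmse)
print(f"baseline={baseline_rmse:.3f} improved={improved_rmse:.3f} "
      f"reduction={reduction:.2f}%")
```

In this toy case every error is halved, so the RMSE is halved as well, which corresponds to a 50% relative reduction.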
Abstract
Monocular height estimation provides an efficient and cost-effective solution for three-dimensional perception in remote sensing. However, training deep neural networks for this task demands abundant annotated data, while high-quality labels are scarce and typically available only in developed regions, which limits model generalization and constrains the models' applicability at large scales. This work addresses the problem by training pixel-wise height estimation networks on imperfect labels from out-of-domain regions; these labels may be incomplete, inexact, or inaccurate compared to high-quality annotations. We introduce an ensemble-based pipeline compatible with any monocular height estimation network, featuring an architecture and loss functions specifically designed to leverage the information in noisy labels through weak supervision, using balanced soft losses and ordinal constraints.…
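To make the idea of an ordinal constraint on noisy, incomplete labels concrete, here is a minimal sketch of a generic pairwise ranking loss. It is not the paper's loss: the abstract names ordinal constraints without defining them, so the logistic form, the `margin` parameter, the `valid` mask, and the function name are all assumptions chosen for illustration. The sketch only asks the predictions to respect the height ordering implied by the labels, and it skips pixels without a label.

```python
import math


def pairwise_ordinal_loss(pred, label, valid, margin=0.5):
    """Generic logistic ranking loss over pixel pairs (illustrative assumption).

    pred   -- predicted heights, one per pixel
    label  -- noisy label heights, one per pixel
    valid  -- booleans; False marks a pixel with a missing label
    margin -- label differences at or below this are treated as ties and skipped
    """
    total, count = 0.0, 0
    n = len(pred)
    for i in range(n):
        for j in range(i + 1, n):
            if not (valid[i] and valid[j]):
                continue  # incomplete labels: ignore pairs with a missing label
            diff = label[i] - label[j]
            if abs(diff) <= margin:
                continue  # inexact labels: small differences carry no ordering
            sign = 1.0 if diff > 0 else -1.0
            # softplus(z) = log(1 + exp(z)), computed in a numerically stable way
            z = -sign * (pred[i] - pred[j])
            total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count if count else 0.0


# Toy example (hypothetical values): three labelled pixels and one unlabelled.
label = [0.0, 5.0, 10.0, 0.0]
valid = [True, True, True, False]
consistent = [1.0, 4.0, 12.0, 7.0]   # same ordering as the labels
reversed_ = [12.0, 4.0, 1.0, 7.0]    # opposite ordering

loss_consistent = pairwise_ordinal_loss(consistent, label, valid)
loss_reversed = pairwise_ordinal_loss(reversed_, label, valid)
print(loss_consistent < loss_reversed)
```

Because only the ordering of label pairs is used, a loss of this kind tolerates labels whose absolute heights are biased, which is one reason ranking-style terms are a common choice under weak supervision.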
Taxonomy
Topics: Glaucoma and retinal disorders · Advanced Vision and Imaging · Image and Object Detection Techniques
