Monocular Building Height Estimation from PhiSat-2 Imagery: Dataset and Method
Yanjiao Song, Bowen Cai, Timo Balz, Zhenfeng Shao, Neema Simon Sumari, James Magidi, Walter Musakwa

TL;DR
This paper introduces a PhiSat-2 building height dataset (PHDataset) and a two-stream ordinal network (TSONet) for estimating building heights from PhiSat-2 satellite imagery, and reports improved accuracy and robustness.
Contribution
It constructs the first PhiSat-2-based building height dataset and proposes TSONet, a two-stream network with modules for footprint-aware feature interaction and ordinal height refinement.
Findings
TSONet reduces MAE and RMSE by over 13% and 9.7%, respectively.
TSONet improves IoU and F1-score by over 14% and 10%, respectively.
PhiSat-2 imagery enhances monocular building height estimation.
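For readers unfamiliar with the four metrics named in the findings, the sketch below shows their standard definitions in plain Python. It is illustrative only and is not the paper's evaluation code. The helper names (`mae`, `rmse`, `iou_and_f1`, `relative_reduction`) are hypothetical, inputs are assumed to be flattened lists of per-pixel values, and reading the reported percentages as relative changes against a baseline is an assumption, since the summary does not name the baseline.

```python
import math

def mae(pred, true):
    """Mean absolute error between predicted and reference heights (metres)."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root mean squared error; penalises large height errors more than MAE."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def iou_and_f1(pred_mask, true_mask):
    """IoU and F1 for binary footprint masks given as flat 0/1 sequences."""
    tp = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    fp = sum(1 for p, t in zip(pred_mask, true_mask) if p and not t)
    fn = sum(1 for p, t in zip(pred_mask, true_mask) if not p and t)
    union = tp + fp + fn
    # Convention (assumption): two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1

def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline` (assumed reading)."""
    return 100.0 * (baseline - ours) / baseline
```

Under this reading, a hypothetical baseline MAE of 10.0 m and a model MAE of 8.7 m would correspond to a 13% reduction; these numbers are made up for illustration and do not come from the paper.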
Abstract
Monocular building height estimation from optical imagery is important for urban morphology characterization but remains challenging due to ambiguous height cues, large inter-city variations in building morphology, and the long-tailed distribution of building heights. PhiSat-2 is a promising open-access data source for this task because of its global coverage, 4.75 m spatial resolution, and seven-band spectral observations, yet its potential has not been systematically evaluated. To address this gap, we construct a PhiSat-2-Height dataset (PHDataset) and propose a Two-Stream Ordinal Network (TSONet). PHDataset contains 9,475 co-registered image-label patch pairs from 26 cities worldwide. TSONet jointly models footprint segmentation and height estimation, and introduces a Cross-Stream Exchange Module (CSEM) and a Feature-Enhanced Bin Refinement (FEBR) module for footprint-aware feature…
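The abstract mentions a long-tailed height distribution and a bin-based ordinal formulation, but the truncated text does not describe how TSONet or FEBR define or refine bins. As a generic illustration of the idea, and not the authors' method, the sketch below discretizes heights into log-spaced bins (finer for low buildings, coarser for rare tall ones) and decodes a height as the probability-weighted mean of bin centres. All function names, the log spacing, and the decoding rule are assumptions made for this example.

```python
import math

def log_spaced_edges(h_min, h_max, n_bins):
    """Bin edges uniform in log space: narrow bins near h_min, wide near h_max."""
    step = (math.log(h_max) - math.log(h_min)) / n_bins
    return [math.exp(math.log(h_min) + i * step) for i in range(n_bins + 1)]

def height_to_bin(h, edges):
    """Index of the bin containing h; values outside the range are clamped."""
    for i in range(len(edges) - 1):
        if h < edges[i + 1]:
            return i
    return len(edges) - 2  # h at or above the last edge goes in the top bin

def expected_height(probs, edges):
    """Decode a height as the probability-weighted mean of bin centres."""
    centers = [(edges[i] + edges[i + 1]) / 2 for i in range(len(probs))]
    return sum(p * c for p, c in zip(probs, centers))
```

With `log_spaced_edges(1.0, 100.0, 2)` the edges are approximately 1, 10, and 100 m, so a 5 m building falls in bin 0 and a 50 m building in bin 1. The design choice here is that log spacing gives the frequent low-rise range more resolution than a uniform split would.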
