Morphology-Guided Cross-Task Coupling for Joint Building Height and Footprint Estimation
Jinzhen Han, JinByeong Lee, Jisung Kim, HongSik Yun

TL;DR
This paper introduces MorphoFormer, a framework that explicitly encodes the coupling between building height (BH) and building footprint (BF) through cross-attention and a consistency loss, lowering BH estimation error while keeping BF accuracy stable.
Contribution
It proposes two mechanisms, a BF-Guided Task Decoder (BGTD) and a Morphology Consistency Loss (MCL), that improve joint BH and BF estimation by modeling the cross-task relationship between the two targets.
Findings
MorphoFormer reduces BH RMSE from 3.39 to 3.15 meters.
The two proposed mechanisms together account for this reduction of approximately 0.24 meters in BH RMSE.
The model maintains stable BF R^2 at 0.80 across experiments.
Abstract
Building height (BH) and building footprint (BF) jointly describe the vertical and horizontal extent of the built environment and are required inputs for urban climate, disaster-risk, and population-mapping models. The two parameters are coupled through floor-area-ratio (FAR) constraints, yet remote-sensing approaches typically treat them as independent regression targets. We argue that explicitly encoding this cross-task coupling is more impactful than further refining individual encoders, and propose MorphoFormer, a joint BH/BF estimation framework built around two complementary mechanisms: (i) a BF-Guided Task Decoder (BGTD) that gates the height branch via cross-attention on a footprint-derived morphology context, and (ii) a Morphology Consistency Loss (MCL) that supervises a height-from-footprint surrogate against the ground-truth BH, indirectly forcing the BF feature to encode…
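The two mechanisms named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the sigmoid gate, the absence of learned query/key/value projections, the linear height-from-footprint surrogate, and the mean-squared-error form of the loss are all assumptions made for illustration, since the summary does not specify them.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_gate(height_feat, morph_tokens):
    """Gate a height feature with context attended from footprint-derived tokens.

    ASSUMPTION: the height feature is the query and the morphology tokens
    serve as both keys and values; a real decoder would use learned projections.
    """
    d = len(height_feat)
    # Scaled dot-product attention scores of the query against each token.
    scores = [dot(height_feat, tok) / math.sqrt(d) for tok in morph_tokens]
    weights = softmax(scores)
    # Attention-weighted average of the tokens (the "morphology context").
    context = [sum(w * tok[i] for w, tok in zip(weights, morph_tokens))
               for i in range(d)]
    # ASSUMPTION: an elementwise sigmoid of the context acts as the gate.
    gate = [1.0 / (1.0 + math.exp(-c)) for c in context]
    return [g * h for g, h in zip(gate, height_feat)]


def morphology_consistency_loss(bf_feats, gt_heights, w, b):
    """MSE between a height-from-footprint surrogate and ground-truth BH.

    ASSUMPTION: the surrogate is a single linear map (w, b) on the BF feature.
    """
    preds = [dot(w, f) + b for f in bf_feats]
    return sum((p - y) ** 2 for p, y in zip(preds, gt_heights)) / len(gt_heights)


# Hypothetical toy inputs: one 2-d height feature, two morphology tokens.
gated = cross_attention_gate([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]])
loss = morphology_consistency_loss(
    [[1.0, 0.0], [0.0, 1.0]], [3.0, 5.0], w=[3.0, 5.0], b=1.0
)
```

Because the loss is computed from the BF feature but compared against ground-truth BH, its gradient would flow into the footprint branch, which is the sense in which the abstract describes the supervision as indirect.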
