Morphology-Guided Cross-Task Coupling for Joint Building Height and Footprint Estimation

Jinzhen Han; JinByeong Lee; Jisung Kim; HongSik Yun

arXiv:2605.04731 · cs.CV · May 7, 2026

TL;DR

This paper introduces MorphoFormer, a framework for joint building height (BH) and building footprint (BF) estimation that explicitly encodes the coupling between the two targets through cross-attention and a consistency loss, reducing BH RMSE from 3.39 to 3.15 meters.

Contribution

The paper proposes two mechanisms, a BF-Guided Task Decoder (BGTD) and a Morphology Consistency Loss (MCL), that improve joint BH and BF estimation by modeling the cross-task relationship between height and footprint.

Findings

01 MorphoFormer reduces BH RMSE from 3.39 to 3.15 meters.

02 The proposed mechanisms account for a BH RMSE reduction of approximately 0.24 meters (3.39 - 3.15).

03 BF R^2 remains stable at 0.80 across experiments.

Abstract

Building height (BH) and building footprint (BF) jointly describe the vertical and horizontal extent of the built environment and are required inputs for urban climate, disaster-risk, and population-mapping models. The two parameters are coupled through floor-area-ratio (FAR) constraints, yet remote-sensing approaches typically treat them as independent regression targets. We argue that explicitly encoding this cross-task coupling is more impactful than further refining individual encoders, and propose MorphoFormer, a joint BH/BF estimation framework built around two complementary mechanisms: (i) a BF-Guided Task Decoder (BGTD) that gates the height branch via cross-attention on a footprint-derived morphology context, and (ii) a Morphology Consistency Loss (MCL) that supervises a height-from-footprint surrogate against the ground-truth BH, indirectly forcing the BF feature to encode…
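
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the layer shapes, the gating function, or the loss norm, so the sigmoid gate, the linear surrogate head, the L1 loss, and every name below (`bf_guided_gate`, `morphology_consistency_loss`, `w_q`, `w_k`, `w_v`, `w_s`, `b_s`) are assumptions chosen only to show the general idea of gating a height branch with footprint-derived context and supervising a footprint-only height prediction.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bf_guided_gate(height_feat, morph_ctx, w_q, w_k, w_v):
    """Gate height-branch tokens with footprint-derived context (assumed form).

    height_feat: (N, d) height-branch tokens (queries).
    morph_ctx:   (M, d) footprint-derived morphology tokens (keys/values).
    w_q, w_k, w_v: (d, d) projection matrices (hypothetical parameters).
    """
    q = height_feat @ w_q
    k = morph_ctx @ w_k
    v = morph_ctx @ w_v
    # Scaled dot-product cross-attention: each height token attends
    # over all morphology tokens.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, M)
    ctx = attn @ v                                           # (N, d)
    # Assumption: a sigmoid turns the attended context into a gate in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-ctx))
    return height_feat * gate


def morphology_consistency_loss(bf_feat, bh_true, w_s, b_s):
    """L1 loss between a footprint-only height surrogate and true BH.

    bf_feat: (N, d) footprint-branch features.
    bh_true: (N,) ground-truth building heights in meters.
    w_s, b_s: weights and bias of a linear surrogate head (hypothetical).
    """
    bh_surrogate = bf_feat @ w_s + b_s
    return float(np.mean(np.abs(bh_surrogate - bh_true)))
```

In a training loop, a term like `morphology_consistency_loss` would be added to the usual BH and BF regression losses, so that gradients from the height target also flow into the footprint features. The weighting between the terms is another choice the abstract does not specify.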
