StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels
Reza M. Asiyabi, Juan Alberto Molina-Valero, The SEOSAW Partnership, Steven Hancock, Casey M. Ryan

TL;DR
StruMPL is a multi-task deep learning framework that estimates forest aboveground biomass (AGB) from heterogeneous, partially labeled Earth observation data by enforcing physical (allometric) constraints between tasks and correcting for labels that are missing not at random (MNAR).
Contribution
It introduces a joint model that pairs a shared encoder with a learnable physics module and a novel loss function, so that disjoint, biased datasets can be combined for biomass estimation.
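To make "disjoint" supervision concrete, here is a minimal sketch of a masked multi-task loss, in which each sample contributes only to the tasks it has labels for. It is an illustration, not the paper's loss: the function name `masked_multitask_mse`, the use of mean squared error, and the two-column layout (canopy height, AGB) are all assumptions made for this example.

```python
import numpy as np

def masked_multitask_mse(pred, target, mask):
    """Per-task mean squared error over labeled entries only (hypothetical helper).

    pred, target, mask: arrays of shape (n_samples, n_tasks).
    mask is 1 where a label exists and 0 where it is missing.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    labeled = np.asarray(mask) > 0
    # Replace missing targets with a placeholder so NaNs cannot leak into the sum.
    safe_target = np.where(labeled, target, 0.0)
    sq_err = np.where(labeled, (pred - safe_target) ** 2, 0.0)
    n_labeled = labeled.sum(axis=0)
    # A task with no labels in the batch contributes zero loss.
    return sq_err.sum(axis=0) / np.maximum(n_labeled, 1)

# Columns: [canopy height, AGB]. Two lidar samples carry height only;
# one plot sample carries AGB only. No row has both labels.
pred = np.array([[10.0, 50.0], [20.0, 80.0], [15.0, 60.0]])
target = np.array([[12.0, np.nan], [20.0, np.nan], [np.nan, 70.0]])
mask = np.array([[1, 0], [1, 0], [0, 1]])
loss = masked_multitask_mse(pred, target, mask)
```

Masking keeps the gradient of each head tied to the samples that actually carry its label, which is the minimum needed when no sample is fully labeled. It does not by itself address the bias in where plot labels occur.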
Findings
StruMPL outperforms existing methods on biomass RMSE and bias.
Augmented inverse probability weighting (AIPW) reduces bias at high AGB by approximately 54%.
The model integrates structural and biomass data under disjoint partial supervision.
Abstract
Estimating forest aboveground biomass (AGB) from Earth observation combines two structurally incompatible label sources: spaceborne lidar provides canopy structure at millions of locations but no biomass estimate, and ground-based plots provide biomass at thousands of biased locations but no metrics of structure. No single training sample carries labels for all target variables, plot labels are missing not at random (MNAR), and biomass is linked to the structural variables by known but biome-specific allometric laws. We formalise this as multi-task dense regression under heterogeneous disjoint partial supervision with MNAR labels and inter-task physical constraints, and propose StruMPL to address it jointly. A shared encoder feeds per-variable regression, imputation, and propensity heads for spatial MNAR correction, and a learnable physics module that evaluates the inter-task constraint…
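The imputation and propensity heads mentioned in the abstract are the two ingredients of an augmented inverse probability weighting (AIPW) estimator. The sketch below shows the textbook AIPW estimate of a population mean, as a rough guide to how the two outputs combine. How StruMPL applies AIPW inside its training loss is not described in this summary, so the function name `aipw_mean`, the clipping constant `eps`, and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def aipw_mean(y, observed, propensity, imputed, eps=1e-6):
    """Textbook AIPW estimate of the mean of y (hypothetical helper).

    y:          labels, with any placeholder (e.g. NaN) where unobserved
    observed:   1 if the label was observed, 0 otherwise
    propensity: estimated probability that each label is observed
    imputed:    model-predicted label for every sample
    """
    r = np.asarray(observed, dtype=float)
    # Clip propensities away from zero so the weights stay finite.
    pi = np.clip(np.asarray(propensity, dtype=float), eps, 1.0)
    m = np.asarray(imputed, dtype=float)
    # Unobserved labels are multiplied by r = 0, so any finite placeholder works.
    y_filled = np.where(r > 0, np.asarray(y, dtype=float), 0.0)
    # Imputed value plus an inverse-propensity-weighted residual on observed samples.
    correction = r * (y_filled - m) / pi
    return float(np.mean(m + correction))

# Toy data: high-biomass sites are sampled less often (lower propensity).
y = np.array([100.0, np.nan, 300.0, np.nan])
observed = np.array([1, 0, 1, 0])
propensity = np.array([0.5, 0.5, 0.25, 0.25])
imputed = np.array([90.0, 110.0, 280.0, 290.0])

estimate = aipw_mean(y, observed, propensity, imputed)
naive = float(np.nanmean(y))  # mean over observed labels only
```

On this toy data the AIPW estimate sits above the naive observed-label mean, because residuals from rarely sampled high-biomass sites are up-weighted. The estimator is consistent if either the propensity model or the imputation model is correct, given that missingness depends only on what the models condition on.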
