Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding
Zhenghao Xie, Jing Xiao, Zhenqi Wang, Kexin Ma, Liang Liao, Gui-Song Xia, Mi Wang

TL;DR
This paper introduces a cost-aware cross-scale observation method for remote sensing that selectively acquires high-resolution imagery, guided by low-resolution global observation, to improve understanding under a constrained acquisition cost.
Contribution
It formulates a unified approach coupling fine-grained high-resolution sampling with cross-patch representation prediction for better scene reasoning under cost constraints.
Findings
The method outperforms existing approaches on recognition tasks.
It achieves a better performance-cost trade-off in remote sensing understanding.
The GL-10M benchmark enables systematic evaluation of cross-scale reasoning.
Abstract
Remote sensing understanding inherently requires multi-resolution observation, since different targets and application tasks demand different levels of spatial detail. While low-resolution (LR) imagery enables efficient global observation, high-resolution (HR) imagery provides critical local details, but at much higher acquisition cost and with limited coverage. This motivates a cross-scale sensing strategy that selectively acquires HR imagery based on LR global perception to improve task performance under constrained cost. Existing HR sampling methods typically make selection decisions from isolated LR patches, ignoring fine-grained intra-patch importance and cross-patch contextual interactions. This leads to fragmented feature representations and suboptimal scene reasoning under sparse HR observations. To address this issue, we formulate cross-scale remote sensing understanding as…
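To make the budgeted selection problem concrete, here is a minimal sketch of the isolated-patch style of HR sampling that the abstract attributes to existing methods. It is not the paper's method. The function name `select_hr_patches`, the use of pixel variance as an importance score, and the patch-count budget are all assumptions chosen for illustration.

```python
from statistics import pvariance


def select_hr_patches(lr_image, patch_size, budget):
    """Choose which LR patches to re-acquire in HR under a patch-count budget.

    Hypothetical baseline: each patch is scored in isolation by its pixel
    variance, so the score uses no cross-patch context and no sub-patch
    detail, which are the two limitations the abstract points out.
    """
    rows = len(lr_image) // patch_size
    cols = len(lr_image[0]) // patch_size

    scores = {}
    for r in range(rows):
        for c in range(cols):
            # Collect the pixels of patch (r, c) from the LR grid.
            pixels = [
                lr_image[r * patch_size + i][c * patch_size + j]
                for i in range(patch_size)
                for j in range(patch_size)
            ]
            scores[(r, c)] = pvariance(pixels)

    # Rank patches by score, highest first, and keep at most `budget` of them.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:max(budget, 0)], scores


# Toy 4x4 LR image with one textured corner; the budget allows one HR patch.
toy_image = [
    [9.0, 0.0, 1.0, 1.0],
    [0.0, 9.0, 1.0, 1.0],
    [2.0, 2.0, 3.0, 3.0],
    [2.0, 2.0, 3.0, 3.0],
]
selected, patch_scores = select_hr_patches(toy_image, patch_size=2, budget=1)
```

Because each score depends only on its own patch, a selector like this cannot tell that two neighbouring patches carry redundant information, nor which part of a patch matters. The abstract names these gaps as the motivation for the proposed formulation.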
