Scale-Aware Recognition in Satellite Images under Resource Constraints
Shreelekha Revankar, Cheng Perng Phoo, Utkarsh Mall, Bharath Hariharan, Kavita Bala

TL;DR
This paper introduces a scale-aware recognition system for satellite images that intelligently balances resolution and accuracy, leveraging knowledge distillation, disagreement-based sampling, and language models to optimize resource use and improve recognition performance.
Contribution
It presents a novel multi-component framework combining knowledge distillation, disagreement sampling, and scale inference to enhance satellite image recognition under resource constraints.
Findings
Achieves up to 26.3% accuracy improvement over HR baselines.
Uses 76.3% fewer high-resolution images.
Demonstrates effective scale inference with LLMs.
Abstract
Recognition of features in satellite imagery (forests, swimming pools, etc.) depends strongly on the spatial scale of the concept and therefore the resolution of the images. This poses two challenges: Which resolution is best suited for recognizing a given concept, and where and when should the costlier higher-resolution (HR) imagery be acquired? We present a novel scheme to address these challenges by introducing three components: (1) A technique to distill knowledge from models trained on HR imagery to recognition models that operate on imagery of lower resolution (LR), (2) a sampling strategy for HR imagery based on model disagreement, and (3) an LLM-based approach for inferring concept "scale". With these components we present a system to efficiently perform scale-aware recognition in satellite imagery, improving accuracy over single-scale inference while following budget…
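The disagreement-based sampling in component (2) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that disagreement is the absolute difference between per-tile scores from two models that both run on LR imagery, and that the budget is a plain count of HR tiles. The names `select_by_disagreement`, `fuse_scores`, and `hr_scorer` are hypothetical.

```python
from typing import Callable, List, Sequence


def select_by_disagreement(scores_a: Sequence[float],
                           scores_b: Sequence[float],
                           budget: int) -> List[int]:
    """Return indices of the `budget` tiles where the two models disagree most.

    Assumption: disagreement is |score_a - score_b| per tile.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    disagreement = [abs(a - b) for a, b in zip(scores_a, scores_b)]
    # Rank tiles from most to least disagreement; ties keep original order.
    ranked = sorted(range(len(disagreement)),
                    key=lambda i: disagreement[i], reverse=True)
    return sorted(ranked[:max(budget, 0)])


def fuse_scores(lr_scores: Sequence[float],
                hr_scorer: Callable[[int], float],
                chosen: Sequence[int]) -> List[float]:
    """Use the HR score on the chosen tiles and the LR score everywhere else."""
    fused = list(lr_scores)
    for i in chosen:
        fused[i] = hr_scorer(i)  # hypothetical call: acquire HR tile i and score it
    return fused


# Toy scores for four tiles from two LR-based models.
scores_a = [0.9, 0.2, 0.5, 0.1]
scores_b = [0.8, 0.7, 0.5, 0.6]
chosen = select_by_disagreement(scores_a, scores_b, budget=2)
fused = fuse_scores(scores_b, lambda i: 1.0, chosen)
```

With these toy scores, tiles 1 and 3 show the largest gap, so only those two would be sent for HR acquisition and the remaining tiles keep their LR scores.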
Peer Reviews
Decision·ICLR 2025 Poster
1. A novel idea of scale determination is proposed for target recognition using remote sensing images of reasonable resolution. 2. A low-resolution detection model trained with a distillation strategy replaces the high-resolution detection model. 3. Two benchmark datasets are constructed for research.
1. The budget constraints seem to have no actual impact on the final result. 2. Details of processes such as distillation and concept-scale LLM training are not provided. 3. The retrieval process has no systematic description covering all of its subprocesses.
1. This paper presents a novel framework to perform the tradeoff between cost and accuracy. 2. The topic is interesting and valuable. 3. The writing is clear. 4. It is good that the authors plan to release their data and download scripts.
1. It is unclear how the system handles noisy labels from OpenStreetMap annotations. 2. Can the proposed system retrieve multiple targets simultaneously? 3. How does the system deal with the time discrepancy between high- and low-resolution images, which can lead to disagreements between the LR and HR models? Did the authors consider the impact of this issue? 4. Runtime is not mentioned in Table 3.
This paper focuses on a highly interesting problem with significant industrial application potential. The proposed knowledge distillation method and model disagreement-based sampling strategy are both innovative and effective. The distillation enhances the performance of LR models, while the sampling strategy efficiently balances the trade-off between cost and accuracy.
The approach heavily relies on the examples provided in the prompt for the LLM. The model’s understanding of the relationship between manually defined categories and image resolutions is largely based on these examples. This dependence raises concerns about the generalizability of the method to niche or less common satellite imagery concepts. The paper does not include experiments comparing the proposed method with techniques that generate HR images via super-resolution, then combine these imag
Taxonomy
Topics: Image Processing and 3D Reconstruction · Geochemistry and Geologic Mapping · Image Retrieval and Classification Techniques
