OVerSeeC: Open-Vocabulary Costmap Generation from Satellite Images and Natural Language
Rwik Rana, Jesse Quattrociocchi, Dongmyeong Lee, Christian Ellis, Amanda Adkins, Adam Uccello, Garrett Warnell, Joydeep Biswas

TL;DR
This paper introduces OVerSeeC, a modular framework that uses foundation models to generate global costmaps from satellite images based on natural language instructions, enabling flexible, open-vocabulary, and mission-specific planning.
Contribution
The paper presents a novel zero-shot, modular approach that combines language models with perception pipelines to generate costmaps from satellite imagery according to natural-language instructions, addressing the limitations of fixed ontologies.
Findings
Handles novel entities and preferences effectively.
Produces routes aligned with human trajectories.
Demonstrates robustness to distribution shifts.
Abstract
Aerial imagery provides essential global context for autonomous navigation, enabling route planning at scales inaccessible to onboard sensing. We address the problem of generating global costmaps for long-range planning directly from satellite imagery when entities and mission-specific traversal rules are expressed in natural language at test time. This setting is challenging since mission requirements vary, terrain entities may be unknown at deployment, and user prompts often encode compositional traversal logic. Existing approaches relying on fixed ontologies and static cost mappings cannot accommodate such flexibility. While foundation models excel at language interpretation and open-vocabulary perception, no single model can simultaneously parse nuanced mission directives, locate arbitrary entities in large-scale imagery, and synthesize them into an executable cost function for…
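To make the idea of turning located entities and stated preferences into a cost function more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function name `compose_costmap`, the per-entity boolean masks, the cost values in [0, 1], the default cost for uncovered cells, and the rule that overlapping entities take the highest cost are all assumptions made for illustration. In a real pipeline the masks would presumably come from an open-vocabulary perception model and the costs from a language model's reading of the user prompt.

```python
from typing import Dict, List

Mask = List[List[bool]]      # hypothetical per-entity mask over the image grid
Costmap = List[List[float]]  # hypothetical traversal cost per grid cell


def compose_costmap(
    masks: Dict[str, Mask],
    costs: Dict[str, float],
    default_cost: float = 0.5,
) -> Costmap:
    """Combine entity masks into one costmap (illustrative rule only).

    Cells covered by no entity get `default_cost`. Cells covered by several
    entities get the highest of their costs, a conservative choice assumed
    here rather than taken from the paper.
    """
    if not masks:
        raise ValueError("at least one entity mask is required")

    first = next(iter(masks.values()))
    rows, cols = len(first), len(first[0])
    costmap = [[default_cost] * cols for _ in range(rows)]
    covered = [[False] * cols for _ in range(rows)]

    for name, mask in masks.items():
        cost = costs[name]
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:
                    continue
                # First entity on this cell sets the cost; later ones only raise it.
                if not covered[r][c] or cost > costmap[r][c]:
                    costmap[r][c] = cost
                    covered[r][c] = True
    return costmap


# Toy 2x2 scene: "road" is cheap, "water" is expensive.
masks = {
    "road": [[True, True], [False, False]],
    "water": [[False, True], [False, True]],
}
costs = {"road": 0.1, "water": 1.0}
costmap = compose_costmap(masks, costs)
```

In this toy scene the top-right cell is covered by both entities and takes the water cost, while the bottom-left cell is covered by neither and keeps the default. Compositional rules expressed in a prompt would need a richer combination step than this single maximum.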
Taxonomy
Topics: Automated Road and Building Extraction · Geographic Information Systems Studies · Multimodal Machine Learning Applications
