TL;DR
UniGeoSeg introduces a unified framework for open-world, instruction-driven segmentation in geospatial scenes, pairing a million-scale instruction dataset (GeoSeg-1M) with new training strategies to improve instruction understanding and generalization.
Contribution
The paper presents the first million-scale remote sensing dataset for instruction-driven segmentation (GeoSeg-1M), a new benchmark (GeoSeg-Bench), and a unified model (UniGeoSeg) with novel training techniques for diverse instruction-driven tasks.
Findings
Achieved state-of-the-art results on GeoSeg-Bench and public benchmarks.
Demonstrated strong zero-shot generalization capabilities.
Provided datasets and code for future research.
Abstract
Instruction-driven segmentation in remote sensing generates masks from guidance, offering great potential for accessible and generalizable applications. However, existing methods suffer from fragmented task formulations and limited instruction data, hindering effective understanding and generalization. To address these issues, we introduce GeoSeg-1M, the first million-scale dataset for remote sensing instruction-driven segmentation, constructed via an automatic mask filtering and instruction generation pipeline that synthesizes referring, interactive, and reasoning segmentation instructions from multiple public datasets. GeoSeg-1M contains 590K images, 117 categories, and 1.1M image-mask-instruction triplets. Building upon this foundation, we further curate GeoSeg-Bench, a challenging benchmark designed to evaluate contextual understanding and reasoning capabilities across diverse…
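As a rough illustration of the image-mask-instruction triplet structure the abstract describes, the sketch below models one record and tallies records by instruction type. The three type names come from the abstract; the class names, field names, file paths, and instruction strings are hypothetical and do not reflect the released GeoSeg-1M format.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class InstructionType(Enum):
    """The three instruction families named in the abstract."""
    REFERRING = "referring"
    INTERACTIVE = "interactive"
    REASONING = "reasoning"


@dataclass(frozen=True)
class Triplet:
    """One image-mask-instruction record (field names are assumptions)."""
    image_path: str
    mask_path: str
    instruction: str
    kind: InstructionType


def count_by_kind(triplets):
    """Return how many triplets fall under each instruction type."""
    return Counter(t.kind for t in triplets)


# Hypothetical records, for illustration only.
samples = [
    Triplet("img_001.png", "mask_001.png",
            "the storage tank closest to the road", InstructionType.REFERRING),
    Triplet("img_001.png", "mask_002.png",
            "point prompt at (120, 88)", InstructionType.INTERACTIVE),
    Triplet("img_002.png", "mask_003.png",
            "where could a helicopter land safely?", InstructionType.REASONING),
    Triplet("img_003.png", "mask_004.png",
            "which area is most exposed to flooding?", InstructionType.REASONING),
]

counts = count_by_kind(samples)
```

Keeping the instruction type as an explicit field makes it straightforward to balance or filter the three task families when sampling, although the paper may organize its data differently.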
