DINO Soars: DINOv3 for Open-Vocabulary Semantic Segmentation of Remote Sensing Imagery

Ryan Faulkenberry; Saurabh Prasad

arXiv:2605.03175·cs.CV·May 6, 2026

TL;DR

This paper introduces CAFe-DINO, a zero-shot open-vocabulary semantic segmentation model for remote sensing imagery that leverages DINOv3's strong foundation without domain-specific fine-tuning.

Contribution

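The paper develops CAFe-DINO, an open-vocabulary segmentation approach for RS imagery that combines cost aggregation with training-free upsampling of DINOv3 text-image similarity scores, and reports state-of-the-art results without RS-specific fine-tuning.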

Findings

01 CAFe-DINO outperforms fine-tuned OVSS methods on RS datasets.

02 DINOv3's backbone enables effective zero-shot RS segmentation.

03 The model is trained on a subset of COCO-Stuff and performs well on RS imagery.

Abstract

The remote sensing (RS) domain suffers from a lack of densely labeled datasets, which are costly to obtain. Thus, models that can segment RS imagery well without supervised fine-tuning are valuable, but existing solutions fall behind supervised methods. Recently, DINOv3 surpassed SOTA RS foundation models on the GEO-bench segmentation benchmark without pre-training on RS data. Additionally, DINO.txt has enabled open-vocabulary semantic segmentation (OVSS) with the DINOv3 backbone. We leverage these developments to form an OVSS model for RS imagery, free of RS-domain fine-tuning. Our model, CAFe-DINO (Cost Aggregation + Feature Upsampling with DINO), exploits the strong OVSS performance of DINOv3 for RS imagery via cost aggregation and training-free upsampling of text-image similarity scores. The robust latent of the DINOv3 backbone eliminates the need for fine-tuning on RS imagery; we…
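
To make the phrase "upsampling of text-image similarity scores" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. Random arrays stand in for DINOv3 patch features and DINO.txt text embeddings, nearest-neighbour repetition stands in for whatever training-free upsampler the paper uses, and the cost aggregation stage is omitted entirely. All function names are hypothetical and are not taken from the rfaulk/DINO_Soars repository.

```python
import numpy as np


def cosine_cost_volume(patch_feats, text_embeds):
    """Cosine similarity between every patch and every class prompt.

    patch_feats: (H, W, D) array of patch features (placeholder for a backbone).
    text_embeds: (C, D) array of class text embeddings (placeholder).
    Returns an (H, W, C) array of similarity scores.
    """
    p = patch_feats / np.linalg.norm(patch_feats, axis=-1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=-1, keepdims=True)
    return p @ t.T


def upsample_nearest(scores, factor):
    """Repeat each patch score factor x factor times.

    Assumption: a simple stand-in for a training-free upsampler.
    """
    return np.repeat(np.repeat(scores, factor, axis=0), factor, axis=1)


def segment(patch_feats, text_embeds, factor):
    """Per-pixel class index: argmax over upsampled similarity scores."""
    scores = cosine_cost_volume(patch_feats, text_embeds)
    dense = upsample_nearest(scores, factor)
    return dense.argmax(axis=-1)


# Toy inputs: a 4x4 patch grid, 8-dim features, 3 class prompts.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(4, 4, 8))
text_embeds = rng.normal(size=(3, 8))

mask = segment(patch_feats, text_embeds, factor=16)
print(mask.shape)  # (64, 64)
```

In a real pipeline the two random arrays would be replaced by features from an image encoder and a text encoder that share an embedding space, and the class prompts would be free-form text, which is what makes the segmentation open-vocabulary.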

Code & Models

Repositories

rfaulk/DINO_Soars (GitHub)
