INSID3: Training-Free In-Context Segmentation with DINOv3

Claudia Cuttano; Gabriele Trivigno; Christoph Reich; Daniel Cremers; Carlo Masone; Stefan Roth

arXiv:2603.28480 · cs.CV · March 31, 2026

TL;DR

INSID3 is a training-free method that leverages frozen DINOv3 features for versatile in-context segmentation, achieving state-of-the-art results without supervision or auxiliary models.

Contribution

The paper demonstrates that a single self-supervised backbone can support both semantic matching and segmentation without additional training or supervision.

Findings

01. Outperforms previous methods by +7.5% mIoU in segmentation tasks.

02. Uses 3x fewer parameters than prior approaches.

03. Operates without any mask or category-level supervision.
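
For readers unfamiliar with the metric in the first finding, mIoU (mean intersection-over-union) can be sketched as below. This is a generic illustration in plain Python, not the paper's evaluation code. The function names, the nested-list mask format, and the choice to score two empty masks as 1.0 are assumptions made for the example. This page does not state how the paper averages scores (per class, per image, or per dataset).

```python
def iou(pred, target):
    """Intersection-over-union of two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds, targets):
    """Average IoU over paired masks (e.g. one pair per class or per image)."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy 2x3 masks (hypothetical data, not from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(iou(pred, target))  # 2 shared pixels / 4 pixels in the union = 0.5
```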

Abstract

In-context segmentation (ICS) aims to segment arbitrary concepts, e.g., objects, parts, or personalized instances, given one annotated visual example. Existing work either (i) fine-tunes vision foundation models (VFMs), which improves in-domain results but harms generalization, or (ii) combines multiple frozen VFMs, which preserves generalization but yields architectural complexity and fixed segmentation granularities. We revisit ICS from a minimalist perspective and ask: Can a single self-supervised backbone support both semantic matching and segmentation, without any supervision or auxiliary models? We show that scaled-up dense self-supervised features from DINOv3 exhibit strong spatial structure and semantic correspondence. We introduce INSID3, a training-free approach that segments concepts at varying granularities only from frozen DINOv3 features, given an in-context example.…
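
To make the idea of matching on frozen features concrete, here is a minimal sketch of a generic prototype-matching baseline. It averages the reference patch features inside the annotated mask and labels each target patch by cosine similarity to that average. This is not INSID3's algorithm, which the truncated abstract does not spell out. The function names, the 0.5 threshold, and the toy 2-D feature vectors standing in for backbone patch features are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def masked_prototype(ref_feats, ref_mask):
    """Average the reference patch features that fall inside the annotated mask."""
    selected = [f for f, m in zip(ref_feats, ref_mask) if m]
    if not selected:
        raise ValueError("reference mask selects no patches")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def segment_by_similarity(target_feats, prototype, threshold=0.5):
    """Mark a target patch as foreground if it is similar enough to the prototype."""
    return [1 if cosine(f, prototype) > threshold else 0 for f in target_feats]


# Toy stand-ins for frozen per-patch features (hypothetical values).
ref_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_mask = [1, 1, 0]  # the in-context annotation over reference patches
target_feats = [[0.8, 0.2], [0.1, 0.9], [1.0, 0.1]]

prototype = masked_prototype(ref_feats, ref_mask)
target_mask = segment_by_similarity(target_feats, prototype)
print(target_mask)  # [1, 0, 1]
```

In a real pipeline the per-patch lists would come from a frozen backbone's dense output, and the resulting patch-level mask would be upsampled to image resolution.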

Code & Models

Repositories

visinf/INSID3 (GitHub)
