Personalized OVSS: Understanding Personal Concept in Open-Vocabulary Semantic Segmentation

Sunghyun Park; Jungsoo Lee; Shubhankar Borse; Munawar Hayat; Sungha Choi; Kyuwoong Hwang; Fatih Porikli

arXiv:2507.11030 · cs.CV · July 16, 2025

Open Access

TL;DR

This paper introduces personalized open-vocabulary semantic segmentation (OVSS), which lets a model recognize user-specific concepts such as 'my mug' by tuning on a few image-mask pairs while preserving its general segmentation performance.

Contribution

The paper proposes a plug-in method, based on text prompt tuning and negative mask proposals, that recognizes personal concepts in OVSS with minimal loss of the original OVSS performance.

Findings

01 Outperforms existing methods on new personalized benchmarks.

02 Effectively recognizes personal concepts with few image-mask pairs.

03 Maintains original OVSS performance while personalizing segmentation.

Abstract

While open-vocabulary semantic segmentation (OVSS) can segment an image into semantic regions based on arbitrarily given text descriptions even for classes unseen during training, it fails to understand personal texts (e.g., 'my mug cup') for segmenting regions of specific interest to users. This paper addresses challenges like recognizing 'my mug cup' among 'multiple mug cups'. To overcome this challenge, we introduce a novel task termed personalized open-vocabulary semantic segmentation and propose a text prompt tuning-based plug-in method designed to recognize personal visual concepts using a few pairs of images and masks, while maintaining the performance of the original OVSS. Based on the observation that reducing false predictions is essential when applying text prompt tuning to this task, our proposed method employs 'negative mask proposal' that captures visual concepts…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems