Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts
Qin Liu, Jaemin Cho, Mohit Bansal, Marc Niethammer

TL;DR
This paper introduces SegNext, an interactive image segmentation method that reintroduces dense visual prompt fusion into generalist models, achieving low latency and high segmentation quality with diverse prompts while outperforming existing methods.
Contribution
The paper reintroduces a dense representation of visual prompts into generalist models, improving segmentation quality without sacrificing efficiency.
Findings
SegNext outperforms state-of-the-art methods on the HQSeg-44K and DAVIS datasets.
Dense prompt fusion significantly enhances segmentation quality.
SegNext supports diverse prompts including clicks, boxes, polygons, scribbles, and masks.
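To make "dense prompt representation" concrete, here is a minimal sketch of how clicks and boxes could be rasterized into an image-sized map. The function name `rasterize_prompts`, the three-channel layout, and the disk radius are illustrative assumptions, not the encoding used in SegNext.

```python
import numpy as np

def rasterize_prompts(height, width, clicks=(), boxes=(), radius=3):
    """Encode clicks and boxes as a dense (3, H, W) float map.

    Assumed channel layout (hypothetical, for illustration only):
    0 = positive clicks, 1 = negative clicks, 2 = boxes.
    """
    dense = np.zeros((3, height, width), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]

    # Each click becomes a filled disk around its (x, y) location.
    for x, y, is_positive in clicks:
        disk = (xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2
        dense[0 if is_positive else 1][disk] = 1.0

    # Each box (x0, y0, x1, y1), inclusive corners, becomes a filled rectangle.
    for x0, y0, x1, y1 in boxes:
        dense[2, y0:y1 + 1, x0:x1 + 1] = 1.0

    return dense

dense = rasterize_prompts(
    64, 64,
    clicks=[(10, 20, True), (40, 40, False)],
    boxes=[(5, 5, 30, 30)],
)
print(dense.shape)  # (3, 64, 64)
```

Polygons, scribbles, and masks could be drawn into further channels in the same way, which is one reason a dense map can accommodate many prompt types with a single input format.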
Abstract
The goal of interactive image segmentation is to delineate specific regions within an image via visual or language prompts. Low-latency and high-quality interactive segmentation with diverse prompts remains challenging for existing specialist and generalist models. Specialist models, with their limited prompts and task-specific designs, experience high latency because the image must be recomputed every time the prompt is updated, due to the joint encoding of image and visual prompts. Generalist models, exemplified by the Segment Anything Model (SAM), have recently excelled in prompt diversity and efficiency, lifting image segmentation to the foundation model era. However, for high-quality segmentations, SAM still lags behind state-of-the-art specialist models despite SAM being trained with 100x more segmentation masks. In this work, we delve deep into the architectural differences…
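The latency argument in the abstract can be illustrated with a toy sketch. Everything here is a stand-in: `ToyEncoder` is a hypothetical placeholder for an expensive image backbone, and element-wise addition is an assumed fusion step, not the mechanism from the paper. The sketch only counts how often the backbone runs as prompts are updated.

```python
import numpy as np

class ToyEncoder:
    """Placeholder for an expensive backbone; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        # Placeholder "features": average over the channel axis.
        return x.mean(axis=0, keepdims=True)

def joint_encoding_session(image, prompt_maps, encoder):
    # Specialist-style: image and prompt are encoded together,
    # so the backbone reruns on every prompt update.
    return [encoder(np.concatenate([image, p], axis=0)) for p in prompt_maps]

def decoupled_session(image, prompt_maps, encoder):
    # Generalist-style: encode the image once, then fuse each
    # prompt map with the cached features (assumed: addition).
    features = encoder(image)
    return [features + p for p in prompt_maps]

image = np.random.rand(3, 32, 32).astype(np.float32)
prompts = [np.zeros((1, 32, 32), dtype=np.float32) for _ in range(3)]

joint_enc, cached_enc = ToyEncoder(), ToyEncoder()
joint_out = joint_encoding_session(image, prompts, joint_enc)
cached_out = decoupled_session(image, prompts, cached_enc)
print(joint_enc.calls, cached_enc.calls)  # 3 1
```

With three prompt updates, the jointly encoded variant runs the backbone three times while the decoupled variant runs it once, which is the efficiency gap the abstract describes.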
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques
Methods: Segment Anything Model
