ZIM: Zero-Shot Image Matting for Anything
Beomyoung Kim, Chanyong Shin, Joonhyun Jeong, Hyungsik Jung, Se-Yun Lee, Sewhan Chun, Dong-Hyun Hwang, Joonsang Yu

TL;DR
ZIM introduces a zero-shot image matting model that leverages a novel dataset and architecture to produce fine-grained masks, enhancing zero-shot segmentation and enabling diverse applications like inpainting and 3D modeling.
Contribution
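The paper presents a zero-shot image matting approach with three components: a label converter that turns segmentation labels into detailed matte labels without manual annotation, a hierarchical pixel decoder that enhances mask representation, and a prompt-aware masked attention mechanism that focuses the model on regions specified by visual prompts.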
Findings
ZIM outperforms existing methods in fine-grained mask generation.
ZIM demonstrates strong zero-shot generalization across tasks.
The approach enables applications like image inpainting and 3D NeRF.
Abstract
The recent segmentation foundation model, Segment Anything Model (SAM), exhibits strong zero-shot segmentation capabilities, but it falls short in generating fine-grained precise masks. To address this limitation, we propose a novel zero-shot image matting model, called ZIM, with two key contributions: First, we develop a label converter that transforms segmentation labels into detailed matte labels, constructing the new SA1B-Matte dataset without costly manual annotations. Training SAM with this dataset enables it to generate precise matte masks while maintaining its zero-shot capability. Second, we design the zero-shot matting model equipped with a hierarchical pixel decoder to enhance mask representation, along with a prompt-aware masked attention mechanism to improve performance by enabling the model to focus on regions specified by visual prompts. We evaluate ZIM using the newly…
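As a rough illustration of the prompt-aware masked attention idea in the abstract, the sketch below restricts cross-attention to image tokens that fall inside a box prompt. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the token-grid box convention, and the hard binary mask are all hypothetical choices made for this example, and the abstract does not specify how ZIM builds its attention mask.

```python
import numpy as np

def box_attention_mask(height, width, box):
    """Flat boolean mask over an H x W token grid; True means inside the box.

    `box` is (x0, y0, x1, y1) in inclusive token-grid coordinates.
    This convention is an assumption made for the example.
    """
    x0, y0, x1, y1 = box
    mask = np.zeros((height, width), dtype=bool)
    mask[y0:y1 + 1, x0:x1 + 1] = True
    return mask.reshape(-1)

def masked_cross_attention(queries, keys, values, key_mask):
    """Scaled dot-product attention where masked-out keys receive zero weight.

    queries: (Q, d), keys: (K, d), values: (K, d_v), key_mask: (K,) bool.
    """
    if not key_mask.any():
        raise ValueError("key_mask must keep at least one key")
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)                   # (Q, K)
    scores = np.where(key_mask[None, :], scores, -np.inf)    # hide keys outside the prompt
    scores = scores - scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)                                 # exp(-inf) == 0
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over kept keys
    return weights @ values, weights

# Example: 2 query tokens attend over a 4 x 4 grid of image tokens,
# restricted to the 2 x 2 region covered by a hypothetical box prompt.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(16, 8))
query_tokens = rng.normal(size=(2, 8))
mask = box_attention_mask(4, 4, (1, 1, 2, 2))
output, weights = masked_cross_attention(query_tokens, image_tokens, image_tokens, mask)
```

Each row of `weights` sums to 1 and is exactly zero outside the box, so the output depends only on tokens inside the prompted region. A soft (non-binary) mask would be a natural variation for prompts without a hard boundary, such as points.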
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques · QR Code Applications and Technologies
Methods: Softmax · Attention Is All You Need · Inpainting · Segment Anything Model · Focus
