MatteFormer: Transformer-Based Image Matting via Prior-Tokens

GyuTae Park; SungJoon Son; JaeYoung Yoo; SeHo Kim; Nojun Kwak

arXiv:2203.15662 · cs.CV · March 30, 2022 · 6 citations

TL;DR

MatteFormer is a transformer-based image matting model that leverages prior-tokens representing trimap regions to improve alpha matte prediction, achieving state-of-the-art results.

Contribution

The paper introduces prior-tokens and a Prior-Attentive Swin Transformer (PAST) block to effectively incorporate trimap information into transformer-based image matting.

Findings

1. Achieves state-of-the-art performance on the Composition-1k and Distinctions-646 datasets.

2. Uses prior-tokens as global representations of the trimap regions.

3. Outperforms previous methods by a significant margin.

Abstract

In this paper, we propose a transformer-based image matting model called MatteFormer, which takes full advantage of trimap information in the transformer block. Our method first introduces a prior-token, which is a global representation of each trimap region (e.g. foreground, background and unknown). These prior-tokens are used as global priors and participate in the self-attention mechanism of each block. Each stage of the encoder is composed of PAST (Prior-Attentive Swin Transformer) blocks, which are based on the Swin Transformer block but differ in two respects: 1) Each has a PA-WSA (Prior-Attentive Window Self-Attention) layer, which performs self-attention not only with spatial-tokens but also with prior-tokens. 2) Each has a prior-memory, which accumulates the prior-tokens from previous blocks and passes them on to the next block. We evaluate our MatteFormer on the commonly…
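The two ideas in the abstract, a global token per trimap region and window attention that also attends to those tokens, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `prior_tokens` and `prior_attentive_attention`, the trimap label encoding (0 = background, 1 = unknown, 2 = foreground), and the choice of a plain mean over each region are assumptions made for illustration. Learned projections, relative position bias, multi-head splitting, and window shifting are all omitted.

```python
import numpy as np


def prior_tokens(features, trimap):
    """Build one global token per trimap region by averaging its features.

    features: (H, W, C) float array of spatial tokens.
    trimap:   (H, W) int array; 0 = background, 1 = unknown, 2 = foreground
              (this encoding is an assumption of the sketch).
    Returns a (3, C) array; an empty region yields a zero token.
    """
    channels = features.shape[-1]
    tokens = []
    for label in (0, 1, 2):
        mask = trimap == label
        if mask.any():
            tokens.append(features[mask].mean(axis=0))
        else:
            tokens.append(np.zeros(channels))
    return np.stack(tokens)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def prior_attentive_attention(window_tokens, priors):
    """Single-head attention where keys/values include the prior-tokens.

    window_tokens: (N, C) spatial tokens of one window (used as queries).
    priors:        (P, C) prior-tokens appended to the keys and values.
    Returns the attended output (N, C) and the attention weights (N, N + P).
    """
    keys = np.concatenate([window_tokens, priors], axis=0)
    scale = window_tokens.shape[-1] ** -0.5
    weights = softmax(window_tokens @ keys.T * scale, axis=-1)
    return weights @ keys, weights


# Toy example: a 2x2 feature map with 2 channels, treated as one window.
features = np.arange(8, dtype=float).reshape(2, 2, 2)
trimap = np.array([[0, 1], [2, 2]])

priors = prior_tokens(features, trimap)
output, weights = prior_attentive_attention(features.reshape(-1, 2), priors)
```

In this sketch each spatial token in the window attends over `N + P` positions rather than `N`, which is how the region-level context enters the attention. A prior-memory in the sense of the abstract could be imitated by concatenating the `priors` arrays from earlier blocks before passing them in, though how the paper combines them is not specified in the text above.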

Code & Models

Repositories

webtoon/matteformer (PyTorch, official)

Taxonomy

Topics: Image Enhancement Techniques · Image and Signal Denoising Methods · Visual Attention and Saliency Detection

Methods: Attention Is All You Need · Swin Transformer · Transformer