Segmenting Transparent Object in the Wild with Transformer

Enze Xie; Wenjia Wang; Wenhai Wang; Peize Sun; Hang Xu; Ding Liang; Ping Luo

arXiv:2101.08461·cs.CV·February 24, 2021·25 cites

TL;DR

This paper introduces Trans10K-v2, a comprehensive transparent object segmentation dataset with 11 categories, and proposes Trans2Seg, a transformer-based segmentation method that outperforms CNN-based approaches, advancing real-world transparent object segmentation.

Contribution

The paper presents a new large-scale dataset for transparent object segmentation and a novel transformer-based segmentation pipeline that outperforms existing CNN-based methods.

Findings

01  Trans2Seg significantly outperforms CNN-based segmentation methods.

02  Trans10K-v2 provides more challenging and more fine-grained transparent object data than Trans10K-v1.

03  The transformer encoder offers the advantage of a global receptive field, in contrast to a CNN's local one.

Abstract

This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two categories, our new dataset has several appealing benefits. (1) It has 11 fine-grained categories of transparent objects that commonly occur in the human domestic environment, making it more practical for real-world applications. (2) Trans10K-v2 poses more challenges for current advanced segmentation methods than its former version. Furthermore, a novel transformer-based segmentation pipeline termed Trans2Seg is proposed. Firstly, the transformer encoder of Trans2Seg provides a global receptive field, in contrast to a CNN's local receptive field, which shows excellent advantages over pure CNN architectures. Secondly, by formulating semantic segmentation as a…
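
The receptive-field contrast drawn in the abstract can be illustrated with a small sketch. This is not the Trans2Seg architecture: it is a generic single-head self-attention layer and a 3-tap convolution applied to a toy 1D token sequence, and every name, shape, and weight below is a hypothetical choice made for illustration. The sketch perturbs only the last token and measures how much the output at the first position changes under each operator.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over x of shape (n, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v  # every output row mixes all n tokens

def local_conv(x, kernel):
    """Zero-padded 1D convolution along the sequence axis, shared across channels."""
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # Output row i only sees rows i-pad .. i+pad of the input.
    return np.stack([(xp[i:i + k] * kernel[:, None]).sum(axis=0)
                     for i in range(len(x))])

# Toy setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, d = 16, 8
x = rng.normal(size=(n, d))
wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
kernel = np.array([0.25, 0.5, 0.25])

# Perturb only the last token, far from position 0.
x_perturbed = x.copy()
x_perturbed[-1] += 1.0

attn_delta = np.abs(self_attention(x_perturbed, wq, wk, wv)[0]
                    - self_attention(x, wq, wk, wv)[0]).max()
conv_delta = np.abs(local_conv(x_perturbed, kernel)[0]
                    - local_conv(x, kernel)[0]).max()

print("attention output change at position 0:", attn_delta)
print("conv output change at position 0:", conv_delta)
```

In this toy setting the attention output at position 0 changes because its softmax weights span the whole sequence, while the single convolution layer's output at position 0 is unaffected because the perturbed token lies outside its 3-tap window. A stack of convolutions would eventually widen the window, so the sketch shows only the one-layer case.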

Code & Models

Datasets

rdyzakya/Trans10K-v2 · dataset · 21 dl

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques