3rd Place Scheme on Instance Segmentation Track of ICCV 2021 VIPriors Challenges

Pengyu Chen; Wanhua Li

arXiv:2110.00242·cs.CV·November 8, 2022

TL;DR

This paper presents a data-efficient instance segmentation approach based on a modified Swin Transformer, achieving competitive results in the ICCV 2021 VIPriors Challenge using only a single GPU.

Contribution

The authors developed a modified Swin Transformer method that uses random flip and multiscale training as data augmentation and multiscale fusion at inference, reaching competitive accuracy while training and testing on a single GPU.

Findings

01

Achieved AP@0.50:0.95 of 0.366 on the test set

02

Ranked second among contestants in AP@0.50:0.95 (medium), with a score of 0.592

03

Used only one GPU for training and testing

Abstract

In this paper, we introduce the data-efficient instance segmentation method we used in the 2021 VIPriors Instance Segmentation Challenge. Our solution is a modified version of Swin Transformer, built on the mmdetection toolbox. To address the lack of data, we train our model with data augmentation, including random flip and multiscale training. During inference, multiscale fusion is used to boost performance. We use only a single GPU throughout training and testing. Our team achieved 0.366 for AP@0.50:0.95 on the test set, which is competitive with other top-ranking methods despite using only one GPU. In addition, our method achieved an AP@0.50:0.95 (medium) of 0.592, which ranks second among all contestants. Overall, our team ranked third among all contestants, as announced by the organizers.
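
As a rough illustration of the augmentation and inference steps named in the abstract, the following is a minimal sketch of how random flip, multiscale training, and multiscale test-time augmentation are commonly expressed as pipeline dictionaries in mmdetection 2.x-style configs. It is not the authors' configuration: the scale lists, the flip ratio, and the normalization values are assumptions chosen for illustration, and the abstract's "multiscale fusion" may be implemented differently in the actual submission.

```python
# Sketch of an mmdetection 2.x-style data pipeline (plain Python dicts).
# All numeric values below are illustrative assumptions, not the paper's settings.

# Commonly used ImageNet normalization values (assumed here).
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True,
)

# Assumed scale bounds for multiscale training.
train_scales = [(1333, 640), (1333, 800)]

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    # With multiscale_mode='range', a scale is sampled between the two
    # bounds for each training image (multiscale training).
    dict(type='Resize', img_scale=train_scales,
         multiscale_mode='range', keep_ratio=True),
    # Random horizontal flip with an assumed probability of 0.5.
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]

# Assumed scales for multiscale inference.
test_scales = [(1333, 640), (1333, 720), (1333, 800)]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    # MultiScaleFlipAug builds one augmented copy of the image per scale
    # (and per flip setting); merging the predictions is left to the detector.
    dict(
        type='MultiScaleFlipAug',
        img_scale=test_scales,
        flip=True,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ],
    ),
]
```

In a setup like this, the pipelines would be referenced from the `data` section of a config file. Whether predictions from the different test scales are actually merged depends on the detector supporting augmented testing, so this fragment only covers the data side of the procedure.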

Taxonomy

Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Video Analysis and Summarization

Methods: Attention Is All You Need · FLIP · Test · Linear Layer · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Residual Connection · Byte Pair Encoding