5th Place Solution for VSPW 2021 Challenge
Jiafan Zhuang, Yixin Zhang, Xinyu Hu, Junjie Li, Zilei Wang

TL;DR
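This paper presents a solution for the VSPW 2021 Challenge that builds on two baseline models, Swin Transformer and MaskFormer, and adds stochastic weight averaging and a hierarchical ensemble. It placed 5th on the private leaderboard without any external semantic segmentation dataset, and it also reports attempts to address long-tail recognition and overfitting.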
Contribution
The authors design a hierarchical ensemble strategy and apply stochastic weight averaging to improve semantic segmentation performance in the challenge.
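Stochastic weight averaging (SWA) keeps an equal-weight running average of the model weights collected at several points late in training. The sketch below illustrates only the averaging step in plain Python. The checkpoint layout (a dict mapping parameter names to flat lists of floats) and the names `update_swa` and `average_checkpoints` are assumptions made for illustration, not the authors' code.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def update_swa(swa: Weights, new: Weights, n_averaged: int) -> Weights:
    """Fold one more checkpoint into an equal-weight running average.

    `n_averaged` is the number of checkpoints already contained in `swa`.
    """
    return {
        name: [
            (s * n_averaged + w) / (n_averaged + 1)
            for s, w in zip(swa[name], new[name])
        ]
        for name in swa
    }


def average_checkpoints(checkpoints: List[Weights]) -> Weights:
    """Average a list of checkpoints one at a time."""
    # Start from a copy of the first checkpoint.
    swa = {name: list(values) for name, values in checkpoints[0].items()}
    for n, ckpt in enumerate(checkpoints[1:], start=1):
        swa = update_swa(swa, ckpt, n)
    return swa


# Toy checkpoints standing in for weights saved at different epochs.
checkpoints = [
    {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]},
    {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]},
    {"layer.weight": [5.0, 6.0], "layer.bias": [2.0]},
]
swa_weights = average_checkpoints(checkpoints)
print(swa_weights)
```

In a PyTorch training loop the same running average is available through `torch.optim.swa_utils.AveragedModel`, and batch normalization statistics generally need to be recomputed for the averaged weights before evaluation.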
Findings
Achieved 5th place on the private leaderboard of the VSPW 2021 Challenge without external semantic segmentation datasets.
Ensemble and stochastic weight averaging improved segmentation results.
Techniques targeting long-tail recognition and overfitting improved results on the val subset but did not carry over to the test subset.
Abstract
In this article, we introduce the solution we used in the VSPW 2021 Challenge. Our experiments are based on two baseline models, Swin Transformer and MaskFormer. To further boost performance, we adopt the stochastic weight averaging technique and design a hierarchical ensemble strategy. Without using any external semantic segmentation dataset, our solution ranked 5th on the private leaderboard. In addition, we made some interesting attempts to tackle the long-tail recognition and overfitting issues, which achieve improvements on the val subset. Possibly due to a distribution difference, these attempts do not work on the test subset. We also describe these attempts in the hope of inspiring other researchers.
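The abstract does not spell out how the ensemble is organized. One plausible reading of "hierarchical", assumed here purely for illustration, is a two-level average: per-pixel class probabilities are first averaged within each group of related models (for example, variants of one baseline), and the group means are then averaged across groups before taking the argmax. The grouping, the equal weights, and the names `mean_probs` and `hierarchical_ensemble` are all hypothetical.

```python
from typing import List

# Hypothetical prediction format: probs[pixel][class] for one model.
Probs = List[List[float]]


def mean_probs(maps: List[Probs]) -> Probs:
    """Average per-pixel class probabilities over several predictions."""
    n = len(maps)
    num_pixels = len(maps[0])
    num_classes = len(maps[0][0])
    return [
        [sum(m[p][c] for m in maps) / n for c in range(num_classes)]
        for p in range(num_pixels)
    ]


def hierarchical_ensemble(groups: List[List[Probs]]) -> List[int]:
    """Average within each group, then across groups, then take the argmax."""
    group_means = [mean_probs(group) for group in groups]
    fused = mean_probs(group_means)
    return [max(range(len(pixel)), key=pixel.__getitem__) for pixel in fused]


# Toy example: 2 pixels, 2 classes, two model groups of different sizes.
group_a = [
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.5], [0.2, 0.8]],
]
group_b = [
    [[0.6, 0.4], [0.1, 0.9]],
]
labels = hierarchical_ensemble([group_a, group_b])
print(labels)
```

Averaging within groups first gives each group the same say in the final prediction regardless of how many members it has, which is one reason a two-level scheme might be preferred over a flat average.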
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Label Smoothing · Byte Pair Encoding · Softmax · Stochastic Weight Averaging · Absolute Position Encodings · Stochastic Depth
