Improving Semantic Segmentation via Self-Training
Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola

TL;DR
This paper presents a semi-supervised self-training method for semantic segmentation that achieves state-of-the-art results with less supervision by combining labeled and pseudo-labeled data, and introduces a fast training schedule.
Contribution
It introduces a robust self-training framework for semantic segmentation that trains jointly on human-annotated and pseudo labels, together with a fast training schedule that speeds up training without hurting accuracy.
Findings
State-of-the-art results on Cityscapes, CamVid, and KITTI datasets.
Effective cross-domain generalization surpassing fine-tuning.
Training acceleration up to 2x without performance loss.
Abstract
Deep learning usually achieves the best results with complete supervision. In the case of semantic segmentation, this means that large amounts of pixelwise annotations are required to learn accurate models. In this paper, we show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm. We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data. Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performance on the Cityscapes, CamVid and KITTI datasets while requiring significantly less supervision. We also demonstrate the effectiveness of self-training on a challenging cross-domain generalization task, outperforming the conventional fine-tuning method by a large margin. Lastly, to alleviate the computational burden caused by the large…
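The teacher-then-pseudo-label procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `IGNORE_INDEX` value, and the optional `min_confidence` filter are all assumptions introduced here for illustration, and the teacher is represented only by a hypothetical callable that returns per-class logits of shape (classes, height, width).

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" label for low-confidence pixels (hypothetical)


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def generate_pseudo_labels(teacher_logits, min_confidence=0.0):
    """Turn teacher logits of shape (C, H, W) into a hard (H, W) label map.

    With min_confidence == 0.0 this is a plain per-pixel argmax. The
    confidence filter is an illustrative assumption, not taken from the paper.
    """
    probs = softmax(teacher_logits, axis=0)
    labels = probs.argmax(axis=0).astype(np.int64)
    if min_confidence > 0.0:
        labels[probs.max(axis=0) < min_confidence] = IGNORE_INDEX
    return labels


def build_joint_training_set(labeled, unlabeled_images, teacher_fn,
                             min_confidence=0.0):
    """Combine human-annotated pairs with teacher-generated pseudo-labeled pairs.

    labeled:          list of (image, mask) pairs with human annotations
    unlabeled_images: list of images without annotations
    teacher_fn:       hypothetical callable, image -> (C, H, W) logits
    """
    pseudo = [
        (image, generate_pseudo_labels(teacher_fn(image), min_confidence))
        for image in unlabeled_images
    ]
    return list(labeled) + pseudo
```

A student model would then be trained on the combined list returned by `build_joint_training_set`, treating pseudo-labeled pairs the same way as human-annotated ones and skipping any pixels marked with `IGNORE_INDEX` in the loss.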
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
