Improving Semantic Segmentation via Video Propagation and Label Relaxation

Yi Zhu; Karan Sapra; Fitsum A. Reda; Kevin J. Shih; Shawn Newsam; Andrew Tao; Bryan Catanzaro

arXiv:1812.01593 · cs.CV · July 4, 2019 · 37 cites

TL;DR

This paper introduces video prediction-based data augmentation and a boundary label relaxation technique to improve semantic segmentation accuracy, achieving state-of-the-art results on multiple datasets.

Contribution

It proposes a novel joint propagation strategy and boundary label relaxation to enhance training robustness and segmentation performance.

Findings

01

Achieved 83.5% mIoU on Cityscapes

02

Surpassed 2018 ROB challenge winning entry on KITTI

03

Significant accuracy improvements with synthesized samples

Abstract

Semantic segmentation requires large amounts of pixel-wise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate mis-alignments in synthesized samples. We demonstrate that training segmentation models on datasets augmented by the synthesized samples leads to significant improvements in accuracy. Furthermore, we introduce a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries. Our proposed methods achieve state-of-the-art mIoUs of 83.5% on Cityscapes and 82.9% on…
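
The boundary label relaxation idea can be illustrated with a small NumPy sketch. The abstract does not spell out the loss, so this is one plausible reading and an assumption on our part: instead of forcing a single class at each pixel, the loss rewards the summed probability of every class that appears in a small window around that pixel. Away from boundaries the window holds one class and the loss reduces to ordinary cross-entropy. The function name `relaxed_boundary_loss` and the `window` parameter are hypothetical, chosen for illustration only.

```python
import numpy as np

def relaxed_boundary_loss(probs, labels, window=3):
    """Illustrative relaxed loss (an assumption, not the authors' code).

    probs:  (C, H, W) per-pixel class probabilities (e.g. softmax output).
    labels: (H, W) integer class ids.
    Returns the mean of -log(sum of probabilities over the classes that
    occur in the window centred on each pixel).
    """
    num_classes, height, width = probs.shape
    pad = window // 2
    # Replicate edge labels so border pixels only see classes that really occur.
    padded = np.pad(labels, pad, mode="edge")
    class_ids = np.arange(num_classes)[:, None, None]
    # allowed[c, y, x] is True when class c appears in the window around (y, x).
    allowed = np.zeros((num_classes, height, width), dtype=bool)
    for dy in range(window):
        for dx in range(window):
            shifted = padded[dy:dy + height, dx:dx + width]
            allowed |= shifted[None, :, :] == class_ids
    # Probability mass assigned to any class allowed at each pixel.
    union_prob = (probs * allowed).sum(axis=0)
    return float(-np.log(np.clip(union_prob, 1e-12, None)).mean())
```

Under this reading, a pixel next to a boundary is not penalised for predicting the neighbouring class, which is one way training could become less sensitive to annotation noise and propagation artifacts along object edges.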

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning