Video Semantic Object Segmentation by Self-Adaptation of DCNN
Seong-Jin Park, Ki-Sang Hong

TL;DR
This paper introduces a self-adaptive framework for video semantic object segmentation that uses confidently estimated frames to improve label consistency across the frames of a video and thereby raise segmentation accuracy.
Contribution
It presents a novel self-adaptation approach for DCNNs in video segmentation, using confident frames to refine labels and improve performance over existing methods.
Findings
Significant improvement over baseline models
Effective offline and online adaptation strategies
Outperforms previous state-of-the-art methods
Abstract
This paper proposes a new framework for semantic segmentation of objects in videos. We address the label inconsistency problem of deep convolutional neural networks (DCNNs) by exploiting the fact that videos have multiple frames: in a few frames the object is confidently estimated (CE), and we use the information in those frames to improve the labels of the other frames. Given the semantic segmentation results of each frame obtained from a DCNN, we sample several CE frames to adapt the DCNN model to the input video by focusing on specific instances in the video rather than general objects in various circumstances. We propose offline and online approaches under different supervision levels. In experiments, our method achieved substantial improvement over the original model and previous state-of-the-art methods.
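The CE-frame sampling step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how confidence is measured, so everything here is an assumption made for illustration: the score (mean of the per-pixel maximum class probability), the names `frame_confidence`, `pseudo_label`, and `select_ce_frames`, and the parameters `k` and `min_conf` are hypothetical and are not taken from the paper.

```python
def frame_confidence(frame_probs):
    """Score one frame (assumed criterion, not the paper's).

    frame_probs[c][y][x] is the predicted probability of class c at pixel (y, x).
    The score is the mean, over pixels, of the highest class probability.
    """
    height, width = len(frame_probs[0]), len(frame_probs[0][0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            total += max(class_map[y][x] for class_map in frame_probs)
    return total / (height * width)


def pseudo_label(frame_probs):
    """Return the per-pixel argmax class map for one frame."""
    num_classes = len(frame_probs)
    height, width = len(frame_probs[0]), len(frame_probs[0][0])
    return [
        [max(range(num_classes), key=lambda c: frame_probs[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def select_ce_frames(video_probs, k=3, min_conf=0.8):
    """Pick up to k frames whose score is at least min_conf.

    Returns the chosen frame indices (most confident first) and a dict that
    maps each chosen index to its argmax label map, which could then serve
    as a training target when fine-tuning the model on the input video.
    """
    scores = [frame_confidence(frame) for frame in video_probs]
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)[:k]
    chosen = [t for t in ranked if scores[t] >= min_conf]
    return chosen, {t: pseudo_label(video_probs[t]) for t in chosen}
```

In this sketch the selected label maps would be the supervision for the adaptation step; the fine-tuning itself, and the distinction between the offline and online variants, are not shown because the abstract gives no detail on them.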
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Deep Convolutional Neural Networks
