A Class-wise Non-salient Region Generalized Framework for Video Semantic Segmentation
Yuhang Zhang, Shishun Tian, Muxin Liao, Zhengyu Zhang, Wenbin Zou, Chen Xu

TL;DR
This paper introduces a novel class-wise non-salient region framework for video semantic segmentation that enhances domain generalization and consistency across frames, applicable to both video and image segmentation tasks.
Contribution
It proposes the class-wise non-salient region generalized (CNSG) framework, which leverages class-wise non-salient features and a reasoning strategy to improve generalization and consistency in semantic segmentation.
Findings
Significant performance improvements on both the video (VGSS) and image (IGSS) generalizable semantic segmentation tasks.
Effective selection and enhancement of generalized features.
Reduced prediction inconsistency across frames.
Abstract
Video semantic segmentation (VSS) is beneficial for dealing with dynamic scenes due to the continuous property of the real-world environment. On the one hand, some methods alleviate the problem of inconsistent predictions between consecutive frames. On the other hand, other methods employ the previous frame as prior information to assist in segmenting the current frame. Although previous methods achieve strong performance on independent and identically distributed (i.i.d.) data, they cannot generalize well to unseen domains. Thus, we explore a new task, video generalizable semantic segmentation (VGSS), which considers both continuous frames and domain generalization. In this paper, we propose a class-wise non-salient region generalized (CNSG) framework for the VGSS task. Concretely, we first define the class-wise non-salient feature, which describes features of the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
