Static-Dynamic Class-level Perception Consistency in Video Semantic Segmentation
Zhigang Cen, Ningyan Guo, Wenjing Xu, Zhiyong Feng, Danlan Huang

TL;DR
This paper introduces a novel class-level perceptual consistency framework for video semantic segmentation, leveraging class prototypes and multi-scale alignment to improve temporal consistency and segmentation accuracy.
Contribution
It proposes a static-dynamic class-level perceptual consistency framework with contrastive learning and semantic alignment modules, advancing beyond pixel-level methods.
Findings
Outperforms prior state-of-the-art methods on the VSPW and Cityscapes datasets
Reduces computational cost by restricting attention to local windows
Improves temporal semantic consistency across video frames
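The cost saving from window-based attention can be illustrated with simple counting. The sketch below is not the paper's method, whose window scheme is not described in this summary. It assumes generic non-overlapping square windows, and the function names, the 64x64 feature map, and the window size of 8 are all hypothetical values chosen for illustration.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every position attends to every position."""
    n = height * width
    return n * n


def window_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when each position attends only to positions inside
    its own non-overlapping window x window tile (assumes exact tiling)."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    n = height * width
    return n * window * window


# Hypothetical sizes, not taken from the paper.
h, w, win = 64, 64, 8
ratio = global_attention_pairs(h, w) // window_attention_pairs(h, w, win)
print(ratio)  # 64: global attention scores 64x more pairs in this setting
```

Under these assumptions the number of scored pairs grows linearly with the number of positions instead of quadratically, which is the usual reason windowed attention is cheaper.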
Abstract
Video semantic segmentation (VSS) has been widely employed in many fields, such as simultaneous localization and mapping, autonomous driving, and surveillance. Its core challenge is how to leverage temporal information to achieve better segmentation. Previous efforts have primarily focused on pixel-level static-dynamic context matching, utilizing techniques such as optical flow and attention mechanisms. Instead, this paper rethinks static-dynamic contexts at the class level and proposes a novel static-dynamic class-level perceptual consistency (SD-CPC) framework. In this framework, we propose a multivariate class prototype with contrastive learning and a static-dynamic semantic alignment module. The former provides class-level constraints for the model, obtaining personalized inter-class features and diversified intra-class features. The latter first establishes intra-frame spatial…
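As a rough illustration of class-level constraints, the sketch below builds one prototype per class by averaging feature vectors and scores each feature against all prototypes with an InfoNCE-style loss. This is a simplified assumption, not the SD-CPC implementation: the paper describes a multivariate class prototype, whereas this sketch uses a single mean vector per class, and every function name, the temperature value, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the feature vectors of each class into one prototype."""
    sums, counts = {}, defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """Mean cross-entropy over prototype similarities: low when each feature
    is closer to its own class prototype than to the other prototypes."""
    total = 0.0
    for vec, lab in zip(features, labels):
        logits = {c: cosine(vec, p) / temperature for c, p in prototypes.items()}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_denom - logits[lab]
    return total / len(features)


# Toy 2-D "pixel features" for two classes (illustrative data only).
features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]]
labels = [0, 0, 1, 1]
protos = class_prototypes(features, labels)
loss_good = prototype_contrastive_loss(features, labels, protos)
loss_bad = prototype_contrastive_loss(features, [1, 1, 0, 0], protos)
print(loss_good < loss_bad)  # correct labels should give the lower loss
```

Minimizing a loss of this shape pulls features toward their own class prototype and pushes them away from the others, which is one common way to encourage separable inter-class features.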
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
Methods: Softmax · Attention Is All You Need · Contrastive Learning
