Memory Based Video Scene Parsing

Zhenchao Jin; Dongdong Yu; Kai Su; Zehuan Yuan; Changhu Wang

arXiv:2109.00373·cs.CV·September 2, 2021·1 cites

Open Access

TL;DR

This report presents a memory-based approach to video scene parsing that uses temporal information to improve pixel-wise semantic labeling across video frames. It achieves an mIoU of 57.44 and 2nd place in the 1st Video Scene Parsing in the Wild Challenge.

Contribution

The paper introduces a memory-based method specifically designed for video scene parsing, addressing the challenge of utilizing temporal data effectively.

Findings

01

Achieved an mIoU of 57.44 on the challenge dataset

02

Secured 2nd place in the Video Scene Parsing in the Wild Challenge

03

Demonstrated the effectiveness of memory-based techniques in video semantic segmentation

Abstract

Video scene parsing is a long-standing and challenging task in computer vision that aims to assign pre-defined semantic labels to the pixels of all frames in a given video. Compared with image semantic segmentation, this task pays more attention to how temporal information can be used to obtain higher predictive accuracy. In this report, we introduce our solution for the 1st Video Scene Parsing in the Wild Challenge, which achieves an mIoU of 57.44 and obtains 2nd place (our team name is CharlesBLWX).
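For readers unfamiliar with the metric, the sketch below shows one common way to compute mean intersection-over-union (mIoU) from per-pixel class labels. It is an illustration only: the function name `mean_iou`, the `ignore_label` value of 255, and the choice to skip classes that never occur are assumptions, and the challenge's exact evaluation protocol is not described in this report. The function returns a fraction in [0, 1], whereas the reported 57.44 is on a percentage scale.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over flat sequences of per-pixel class ids.

    Hypothetical helper for illustration; not the challenge's official scorer.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            # Assumed convention: pixels with this ground-truth label are not scored.
            continue
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class and,
            # if it is a valid class id, of the predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that appear in the prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Six pixels, three classes; the lists stand in for flattened label maps.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
```

For video data, the same counts would typically be accumulated over the pixels of every frame before the per-class ratios are taken, although whether scores are pooled over the whole dataset or averaged per video is an evaluation choice this sketch does not settle.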

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning