Shift and matching queries for video semantic segmentation

Tsubasa Mizuno; Toru Tamaki

arXiv:2410.07635·cs.CV·October 11, 2024

TL;DR

This paper introduces a novel method for video semantic segmentation that extends query-based image segmentation models to videos by using feature shift and query matching, improving temporal consistency and segmentation quality.

Contribution

The paper proposes a new approach combining feature shift and query matching to adapt image segmentation models for video, ensuring consistent mask tracking across frames.

Findings

1. Significant performance improvements on CityScapes-VPS and VSPW datasets.

2. Enhanced temporal consistency in video segmentation results.

3. Efficient reuse of pre-trained weights in the proposed method.

Abstract

Video segmentation is a popular task, but applying image segmentation models frame-by-frame to videos does not preserve temporal consistency. In this paper, we propose a method to extend a query-based image segmentation model to video using feature shift and query matching. The method uses a query-based architecture, where decoded queries represent segmentation masks. These queries should be matched before performing the feature shift to ensure that the shifted queries represent the same mask across different frames. Experimental results on CityScapes-VPS and VSPW show significant improvements over the baselines, highlighting the method's effectiveness in enhancing segmentation quality while efficiently reusing pre-trained weights.
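
The match-then-shift order described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the matching criterion or the shift pattern, so the sketch assumes cosine similarity with Hungarian matching (SciPy's `linear_sum_assignment`) and a channel-fold temporal shift in which some channels are exchanged with the neighbouring frames. The function names `match_queries`, `shift_channels`, and `match_then_shift`, and the `fold` parameter, are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_queries(ref, cur):
    """Return `perm` such that cur[perm[i]] is the query matched to ref[i].

    ref, cur: (num_queries, dim) arrays of decoded queries from two frames.
    Assumption: matching maximizes total cosine similarity.
    """
    ref_n = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    cur_n = cur / np.linalg.norm(cur, axis=1, keepdims=True)
    cost = -(ref_n @ cur_n.T)  # negate because the solver minimizes cost
    _, col = linear_sum_assignment(cost)
    return col  # rows come back in order 0..N-1 for a square cost matrix


def shift_channels(queries, fold):
    """Shift channels along time for aligned queries of shape (T, N, C).

    Assumption: the first `fold` channels are taken from the previous frame,
    the next `fold` channels from the following frame, and the rest stay put.
    Frames without a neighbour receive zeros in the shifted channels.
    """
    out = np.zeros_like(queries)
    out[1:, :, :fold] = queries[:-1, :, :fold]
    out[:-1, :, fold:2 * fold] = queries[1:, :, fold:2 * fold]
    out[:, :, 2 * fold:] = queries[:, :, 2 * fold:]
    return out


def match_then_shift(queries, fold):
    """Reorder each frame's queries to follow the previous frame, then shift."""
    aligned = [queries[0]]
    for t in range(1, len(queries)):
        perm = match_queries(aligned[-1], queries[t])
        aligned.append(queries[t][perm])
    return shift_channels(np.stack(aligned), fold)
```

Without the matching step, slot `i` in one frame and slot `i` in the next may describe different masks, so the shift would mix unrelated queries. Reordering first means each shifted channel comes from the query that represents the same mask.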


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Medical Image Segmentation Techniques