An End-to-End Trainable Video Panoptic Segmentation Method using Transformers

Jeongwon Ryu; Kwangjin Yoon

arXiv:2110.04009 · cs.CV · October 11, 2021

TL;DR

This paper introduces an end-to-end trainable video panoptic segmentation method utilizing transformers, capable of generating unified segmentation and tracking results across video sequences, and demonstrates competitive performance on benchmark datasets.

Contribution

The paper presents a novel transformer-based algorithm for video panoptic segmentation that can be trained end-to-end, unifying segmentation and tracking tasks in a single framework.

Findings

1. Achieved 57.81% STQ on the KITTI-STEP dataset.
2. Achieved 31.8% STQ on the MOTChallenge-STEP dataset.
3. Demonstrated end-to-end training of a single model for video segmentation and tracking.

Abstract

In this paper, we present an algorithm for video panoptic segmentation, a newly emerging area of research. Video panoptic segmentation unifies the tasks of panoptic segmentation and multi-object tracking: it requires generating instance tracking IDs along with panoptic segmentation results across video sequences. Our proposed algorithm uses a transformer and can be trained end-to-end with multiple video frames as input. We test our method on the STEP dataset and report its performance with the recently proposed STQ metric. The method achieved 57.81% on the KITTI-STEP dataset and 31.8% on the MOTChallenge-STEP dataset.
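
For context on the reported scores: STQ (Segmentation and Tracking Quality), as defined for the STEP benchmark, is the geometric mean of an association-quality term (AQ) and a segmentation-quality term (SQ). A minimal sketch of that final combination step follows. The function name and the input values are illustrative assumptions, not the authors' evaluation code or their results.

```python
import math


def stq(aq: float, sq: float) -> float:
    """Combine association quality (AQ) and segmentation quality (SQ) into STQ.

    Hypothetical helper for illustration; both inputs are fractions in [0, 1].
    """
    if not (0.0 <= aq <= 1.0 and 0.0 <= sq <= 1.0):
        raise ValueError("AQ and SQ are expected to lie in [0, 1]")
    # STQ is the geometric mean of the two terms, so a model must do well
    # on both tracking (AQ) and segmentation (SQ) to score highly.
    return math.sqrt(aq * sq)


# Illustrative inputs only; these are NOT numbers from the paper.
example = stq(aq=0.5, sq=0.72)
print(f"STQ = {example:.2%}")
```

Because the two terms are multiplied, a weak tracking score pulls STQ down even when per-frame segmentation is strong, which is why a single score can summarize both halves of the task.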

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection