RT-K-Net: Revisiting K-Net for Real-Time Panoptic Segmentation
Markus Schön, Michael Buchholz, Klaus Dietmayer

TL;DR
RT-K-Net is a real-time panoptic segmentation model that revisits the K-Net architecture, changing its architecture, training, and inference procedure to reduce latency and improve performance on Cityscapes and Mapillary Vistas.
Contribution
The paper modifies K-Net's architecture, training, and inference procedure for real-time use, setting a new state of the art among real-time panoptic segmentation methods on Cityscapes.
Findings
Achieves 60.2% PQ on Cityscapes at an average inference time of 32 ms for full-resolution 1024x2048 images on a single Titan RTX GPU.
Reaches 33.2% PQ on Mapillary Vistas with 69 ms inference.
Decreases latency and improves performance through changes to the K-Net architecture, training, and inference procedure.
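The findings above are stated in PQ (panoptic quality) and milliseconds per image. As a reading aid, here is a minimal Python sketch of how per-class PQ is commonly computed, plus a conversion from latency to frames per second. It is not taken from the paper's code. The function names and the toy IoU values are hypothetical, and the sketch assumes the standard PQ definition, in which predicted and ground-truth segments are matched when their IoU exceeds 0.5 and the reported PQ is the average over classes.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Per-class PQ: sum of IoUs over matched segments / (TP + 0.5*FP + 0.5*FN).

    matched_ious: IoU of each matched (true-positive) segment pair, each > 0.5.
    num_fp: number of unmatched predicted segments.
    num_fn: number of unmatched ground-truth segments.
    """
    num_tp = len(matched_ious)
    denominator = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        return 0.0  # class absent from both prediction and ground truth
    return sum(matched_ious) / denominator


def frames_per_second(latency_ms):
    """Convert an average per-image latency in milliseconds to throughput."""
    return 1000.0 / latency_ms


# Toy values (hypothetical, not from the paper): three matches, one FP, one FN.
toy_pq = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)

# Latencies reported above: 32 ms (Cityscapes) and 69 ms (Mapillary Vistas).
cityscapes_fps = frames_per_second(32)
vistas_fps = frames_per_second(69)
```

With these helpers, the reported 32 ms on Cityscapes corresponds to 31.25 frames per second, and 69 ms on Mapillary Vistas to roughly 14.5 frames per second.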
Abstract
Panoptic segmentation is one of the most challenging scene parsing tasks, combining the tasks of semantic segmentation and instance segmentation. While much progress has been made, few works focus on the real-time application of panoptic segmentation methods. In this paper, we revisit the recently introduced K-Net architecture. We propose vital changes to the architecture, training, and inference procedure, which massively decrease latency and improve performance. Our resulting RT-K-Net sets a new state-of-the-art performance for real-time panoptic segmentation methods on the Cityscapes dataset and shows promising results on the challenging Mapillary Vistas dataset. On Cityscapes, RT-K-Net reaches 60.2 % PQ with an average inference time of 32 ms for full resolution 1024x2048 pixel images on a single Titan RTX GPU. On Mapillary Vistas, RT-K-Net reaches 33.2 % PQ with an average…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
Methods: K-Net
