PanSR: An Object-Centric Mask Transformer for Panoptic Segmentation

Lojze Žust; Matej Kristan

arXiv:2412.10589·cs.CV·December 17, 2024

TL;DR

PanSR introduces an object-centric mask transformer that improves small-object detection and the segmentation of crowded scenes, achieving state-of-the-art results on multiple benchmarks.

Contribution

The paper proposes a novel panoptic segmentation method that addresses key shortcomings of existing mask-transformer approaches, notably improving small-object detection and reducing instance merging.

Findings

01. +3.4 PQ improvement on the LaRS benchmark

02. State-of-the-art performance on Cityscapes

03. Effective mitigation of instance merging and of missed small objects

Abstract

Panoptic segmentation is a fundamental task in computer vision and a crucial component for perception in autonomous vehicles. Recent mask-transformer-based methods achieve impressive performance on standard benchmarks but face significant challenges with small objects, crowded scenes and scenes exhibiting a wide range of object scales. We identify several fundamental shortcomings of the current approaches: (i) the query proposal generation process is biased towards larger objects, resulting in missed smaller objects, (ii) initially well-localized queries may drift to other objects, resulting in missed detections, and (iii) spatially well-separated instances may be merged into a single mask, causing inconsistent and false scene interpretations. To address these issues, we rethink the individual components of the network and its supervision, and propose a novel method for panoptic segmentation…

Code & Models

Repositories

lojzezust/pansr (Official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Handwritten Text Recognition Techniques · Image and Object Detection Techniques