A Fair Ranking and New Model for Panoptic Scene Graph Generation
Julian Lorenz, Alexander Pest, Daniel Kienzle, Katja Ludwig, Rainer Lienhart

TL;DR
This paper corrects evaluation flaws in panoptic scene graph generation, introduces a new two-stage model called DSFormer that outperforms existing models, and emphasizes the importance of proper evaluation protocols.
Contribution
It provides a fair evaluation protocol for PSGG, demonstrates the competitiveness of two-stage methods, and introduces the DSFormer model that achieves state-of-the-art results.
Findings
Existing evaluation protocols can be exploited to inflate scores.
Two-stage methods are competitive with one-stage methods when evaluated fairly.
DSFormer significantly outperforms previous models on the corrected benchmark.
Abstract
In panoptic scene graph generation (PSGG), models retrieve interactions between objects in an image which are grounded by panoptic segmentation masks. Previous evaluations on panoptic scene graphs have been subject to an erroneous evaluation protocol where multiple masks for the same object can lead to multiple relation distributions per mask-mask pair. This can be exploited to increase the final score. We correct this flaw and provide a fair ranking over a wide range of existing PSGG models. The observed scores for existing methods increase by up to 7.4 mR@50 for all two-stage methods, while dropping by up to 19.3 mR@50 for all one-stage methods, highlighting the importance of a correct evaluation. Contrary to recent publications, we show that existing two-stage methods are competitive with one-stage methods. Building on this, we introduce the Decoupled SceneFormer (DSFormer), a novel…
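The flaw described in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the `Pred` type, the `recall_at_k` function, the `one_per_pair` flag, and the `mask_to_object` mapping (predicted mask id to matched ground-truth object id) are all hypothetical names chosen for illustration. For simplicity the sketch computes plain recall@k on one image rather than mR@k, and it assumes mask-to-object matching has already been done. It shows how a second mask for the same object lets a model submit a second predicate guess for the same object pair.

```python
from typing import NamedTuple


class Pred(NamedTuple):
    """One predicted relation between two predicted masks (hypothetical type)."""
    subj_mask: int
    predicate: str
    obj_mask: int
    score: float


def recall_at_k(preds, gt_triplets, mask_to_object, k, one_per_pair):
    """Recall@k over ground-truth triplets (subject_obj, predicate, object_obj).

    If one_per_pair is True, only the highest-scored prediction per matched
    (subject object, object object) pair is kept before taking the top k.
    This is one possible way to stop duplicate masks from yielding extra guesses.
    """
    ranked = sorted(preds, key=lambda p: p.score, reverse=True)
    seen_pairs = set()
    kept = []
    for p in ranked:
        pair = (mask_to_object[p.subj_mask], mask_to_object[p.obj_mask])
        if one_per_pair:
            if pair in seen_pairs:
                continue  # a better-scored guess for this pair already exists
            seen_pairs.add(pair)
        kept.append((pair[0], p.predicate, pair[1]))
    hits = set(kept[:k]) & set(gt_triplets)
    return len(hits) / len(gt_triplets)


# Toy image: object 0 is a person, object 1 is a horse.
# Masks 10 and 11 both cover the person (duplicate masks), mask 12 the horse.
mask_to_object = {10: 0, 11: 0, 12: 1}
gt = {(0, "riding", 1)}
preds = [
    Pred(10, "looking at", 12, 0.9),  # first guess for person-horse
    Pred(11, "riding", 12, 0.8),      # second guess via the duplicate mask
]

naive = recall_at_k(preds, gt, mask_to_object, k=2, one_per_pair=False)
fair = recall_at_k(preds, gt, mask_to_object, k=2, one_per_pair=True)
print(naive, fair)  # 1.0 0.0
```

In this toy case the unconstrained score counts the second guess as a hit, while the constrained score keeps only the top-ranked guess for the person-horse pair, which is wrong.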
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Artificial Intelligence in Games
