Salience-SGG: Enhancing Unbiased Scene Graph Generation with Iterative Salience Estimation

Runfeng Qu; Ole Hall; Pia K Bideau; Julie Ouerfelli-Ethier; Martin Rolfs; Klaus Obermayer; Olaf Hellwich

arXiv:2601.08728·cs.CV·January 14, 2026

TL;DR

Salience-SGG introduces an iterative salience decoder that improves unbiased scene graph generation by focusing on salient spatial structures, enhancing spatial understanding and achieving state-of-the-art results.

Contribution

The paper presents a novel iterative salience decoder and semantic-agnostic salience labels to improve unbiased scene graph generation, especially in spatial understanding.

Findings

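01. Achieves state-of-the-art performance on Visual Genome, Open Images V6, and GQA-200.

02. Improves the spatial understanding of existing Unbiased-SGG methods.

03. Improves Pairwise Localization Average Precision, the metric the paper uses to demonstrate spatial understanding.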

Abstract

Scene Graph Generation (SGG) suffers from a long-tailed distribution, where a few predicate classes dominate while many others are underrepresented, leading to biased models that underperform on rare relations. Unbiased-SGG methods address this issue by implementing debiasing strategies, but often at the cost of spatial understanding, resulting in an over-reliance on semantic priors. We introduce Salience-SGG, a novel framework featuring an Iterative Salience Decoder (ISD) that emphasizes triplets with salient spatial structures. To support this, we propose semantic-agnostic salience labels guiding ISD. Evaluations on Visual Genome, Open Images V6, and GQA-200 show that Salience-SGG achieves state-of-the-art performance and improves existing Unbiased-SGG methods in their spatial understanding as demonstrated by the Pairwise Localization Average Precision.
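The long-tail problem described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's evaluation protocol: the predicate names, the counts, and the helper `recall_and_mean_recall` are all made up for illustration, and real SGG benchmarks score ranked triplets (Recall@K) rather than flat label lists. It only shows why a model that always predicts the dominant predicate can look strong on overall recall while failing on rare relations.

```python
from collections import Counter


def recall_and_mean_recall(gold, pred):
    """Return (overall recall, mean per-class recall) for paired label lists.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    total = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    overall = sum(hits.values()) / len(gold)
    # Averaging per class gives every predicate equal weight, so rare
    # predicates count as much as the dominant one.
    mean_per_class = sum(hits[c] / total[c] for c in total) / len(total)
    return overall, mean_per_class


# Toy long-tailed ground truth (assumed counts): "on" dominates.
gold = ["on"] * 90 + ["standing on"] * 6 + ["parked on"] * 4
# A biased model that always predicts the head class.
biased_pred = ["on"] * len(gold)

overall, mean_per_class = recall_and_mean_recall(gold, biased_pred)
print(f"overall recall: {overall:.2f}, mean per-class recall: {mean_per_class:.2f}")
```

In this toy setting the biased predictor recovers every "on" instance and none of the rare predicates, so the per-class average drops far below the overall figure.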

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Visual Attention and Saliency Detection