Through the PRISm: Importance-Aware Scene Graphs for Image Retrieval
Dimitrios Georgoulopoulos, Nikolaos Chaidos, Angeliki Dimitriou, Giorgos Stamou

TL;DR
PRISm is an importance-aware scene graph approach to image retrieval. It combines an importance prediction module with edge-aware relational encoding to improve the accuracy and interpretability of semantic image retrieval.
Contribution
It presents a new multimodal framework combining importance prediction and edge-aware graph neural networks for enhanced semantic image retrieval.
Findings
Achieves superior retrieval performance on benchmark datasets.
Effectively captures key objects and interactions for interpretability.
Outperforms prior methods in semantic alignment with human perception.
Abstract
Accurately retrieving images that are semantically similar remains a fundamental challenge in computer vision, as traditional methods often fail to capture the relational and contextual nuances of a scene. We introduce PRISm (Pruning-based Image Retrieval via Importance Prediction on Semantic Graphs), a multimodal framework that advances image-to-image retrieval through two novel components. First, the Importance Prediction Module identifies and retains the most critical objects and relational triplets within an image while pruning irrelevant elements. Second, the Edge-Aware Graph Neural Network explicitly encodes relational structure and integrates global visual features to produce semantically informed image embeddings. PRISm achieves image retrieval that closely aligns with human perception by explicitly modeling the semantic importance of objects and their interactions, capabilities…
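The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Triplet` class, the `prune_scene_graph` function, the `keep_ratio` parameter, and the importance scores are all hypothetical, and a simple top-k cutoff stands in for whatever selection rule the Importance Prediction Module actually uses.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """A (subject, predicate, object) relation from a scene graph."""
    subject: str
    predicate: str
    obj: str
    importance: float  # hypothetical score, e.g. the output of a learned predictor


def prune_scene_graph(triplets, keep_ratio=0.5):
    """Keep the highest-scoring triplets and the objects they mention.

    Assumption: a plain top-k cutoff; the paper's actual pruning rule
    is not specified in the abstract.
    """
    if not triplets:
        return [], set()
    k = max(1, math.ceil(len(triplets) * keep_ratio))
    # Sort by descending importance and keep the first k triplets.
    kept = sorted(triplets, key=lambda t: t.importance, reverse=True)[:k]
    # Objects survive only if at least one retained triplet refers to them.
    objects = {t.subject for t in kept} | {t.obj for t in kept}
    return kept, objects


# Made-up example scene with made-up scores.
scene = [
    Triplet("man", "riding", "horse", 0.92),
    Triplet("cloud", "in", "sky", 0.15),
    Triplet("horse", "on", "beach", 0.71),
    Triplet("rock", "near", "water", 0.08),
]
kept, objects = prune_scene_graph(scene, keep_ratio=0.5)
for t in kept:
    print(t.subject, t.predicate, t.obj)
```

With these made-up scores, the two background relations are dropped and only the objects referenced by the retained triplets remain. The pruned graph would then be passed to a relational encoder such as the Edge-Aware Graph Neural Network the abstract describes.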
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Advanced Image and Video Retrieval Techniques
