Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Sangwoong Yoon, Woo Young Kang, Sungwook Jeon, SeongEun Lee, Changjin Han, Jonghun Park, Eun-Sol Kim

TL;DR
This paper introduces an image-to-image retrieval method that ranks images by scene graph similarity learned with graph neural networks. On a newly collected human-annotated dataset, the method agrees with human perception of image similarity better than competitive baselines.
Contribution
The paper presents a new approach using scene graph similarity with graph neural networks for image retrieval, along with a human-annotated dataset for evaluation.
Findings
Method aligns closely with human perception of image similarity.
Proposed approach outperforms competitive baselines.
New dataset for image relevance evaluation is introduced.
Abstract
As a scene graph compactly summarizes the high-level content of an image in a structured and symbolic manner, the similarity between scene graphs of two images reflects the relevance of their contents. Based on this idea, we propose a novel approach for image-to-image retrieval using scene graph similarity measured by graph neural networks. In our approach, graph neural networks are trained to predict the proxy image relevance measure, computed from human-annotated captions using a pre-trained sentence similarity model. We collect and publish the dataset for image relevance measured by human annotators to evaluate retrieval algorithms. The collected dataset shows that our method agrees better with the human perception of image similarity than other competitive baselines.
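The training signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the single mean-aggregation message-passing layer, the cosine similarity, the squared-error loss, and all names (`embed_graph`, `proxy_loss`, `weight`) are assumptions made for illustration, and random vectors stand in for the output of a pre-trained sentence similarity model applied to captions.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def embed_graph(node_feats, adj, weight):
    """One round of mean-aggregation message passing, then mean pooling.

    node_feats: (n, d) node features; adj: (n, n) adjacency; weight: (d, h).
    This is a hypothetical stand-in for a graph neural network encoder.
    """
    adj_self = adj + np.eye(adj.shape[0])        # add self-loops
    deg = adj_self.sum(axis=1, keepdims=True)    # row degrees, always >= 1
    hidden = np.maximum((adj_self / deg) @ node_feats @ weight, 0.0)  # ReLU
    return hidden.mean(axis=0)                   # graph-level vector

def proxy_loss(graph_a, graph_b, cap_emb_a, cap_emb_b, weight):
    """Squared error between predicted graph similarity and the caption proxy."""
    pred = cosine(embed_graph(*graph_a, weight), embed_graph(*graph_b, weight))
    target = cosine(cap_emb_a, cap_emb_b)        # proxy relevance (assumed form)
    return (pred - target) ** 2

# Toy inputs: two small scene graphs and placeholder caption embeddings.
rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 8))
graph_a = (rng.normal(size=(3, 4)),
           np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float))
graph_b = (rng.normal(size=(2, 4)),
           np.array([[0, 1], [1, 0]], dtype=float))
cap_a, cap_b = rng.normal(size=16), rng.normal(size=16)

loss = proxy_loss(graph_a, graph_b, cap_a, cap_b, weight)
print(f"proxy loss: {loss:.4f}")
```

In a real system, `weight` would be learned by minimizing this loss over many image pairs, and retrieval would rank database images by the predicted similarity of their scene graphs to the query's scene graph.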
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
