Relationship Analysis of Image-Text Pair in SNS Posts
Takuto Nabeoka, Yijun Duan, Qiang Ma

TL;DR
This paper presents a novel graph-based approach using CLIP embeddings and GCNs to classify image-text pairs in SNS posts into similar or complementary relationships, enhancing understanding of their interactions.
Contribution
It introduces a new graph construction and classification method for image-text relationships in SNS, addressing limitations of previous similarity-only approaches.
Findings
Effective classification of image-text relationships demonstrated
Improved accuracy over baseline methods
Robustness shown on public dataset
Abstract
Social networking services (SNS) contain vast amounts of image-text posts, necessitating effective analysis of their relationships for improved information retrieval. This study addresses the classification of image-text pairs in SNS, overcoming prior limitations in distinguishing relationships beyond similarity. We propose a graph-based method to classify image-text pairs into similar and complementary relationships. Our approach first embeds images and text using CLIP, followed by clustering. Next, we construct an Image-Text Relationship Clustering Line Graph (ITRC-Line Graph), where clusters serve as nodes. Finally, edges and nodes are swapped in a pseudo-graph representation. A Graph Convolutional Network (GCN) then learns node and edge representations, which are fused with the original embeddings for final classification. Experimental results on a publicly available dataset…
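The node/edge swap described in the abstract can be illustrated with a small sketch. This is not the authors' ITRC-Line Graph implementation, whose details are not given here. It assumes that each post links one image cluster to one text cluster, and that "swapping edges and nodes" corresponds to the standard line-graph transform, in which each original edge becomes a node and two such nodes are connected when their edges share an endpoint. The function names and the toy cluster labels are hypothetical.

```python
from itertools import combinations


def build_cluster_edges(image_clusters, text_clusters):
    """One edge per post, joining its image cluster to its text cluster.

    Cluster ids are tagged with "img"/"txt" so that image cluster 0 and
    text cluster 0 are treated as different nodes (an assumption).
    """
    return [
        (("img", img_c), ("txt", txt_c))
        for img_c, txt_c in zip(image_clusters, text_clusters)
    ]


def line_graph(edges):
    """Standard line-graph transform.

    Node i of the result stands for edges[i]; nodes i and j are adjacent
    when the two original edges share at least one endpoint.
    """
    adjacency = {i: set() for i in range(len(edges))}
    for i, j in combinations(range(len(edges)), 2):
        if set(edges[i]) & set(edges[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Toy cluster assignments for four posts (hypothetical values).
image_clusters = [0, 0, 1, 2]
text_clusters = [0, 1, 1, 2]

edges = build_cluster_edges(image_clusters, text_clusters)
adj = line_graph(edges)
print(adj)
```

In this toy case, posts 0 and 1 become neighbors because they share an image cluster, posts 1 and 2 because they share a text cluster, and post 3 stays isolated. A GCN would then propagate features over this adjacency, so each image-text pair is represented in the context of pairs that share a cluster with it.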
Taxonomy
Topics: Educational Systems and Policies
