TL;DR
This paper introduces a multimodal approach that combines text and images for POI type prediction. It outperforms state-of-the-art text-only methods, reaching a macro F1 score of 47.21 across eight categories, and analyzes how the two modalities interact.
Contribution
It is the first work to use both text and images for POI type prediction, improving performance and advancing the understanding of multimodal social media data.
Findings
Achieved a macro F1 score of 47.21 across eight categories.
Outperformed state-of-the-art text-only POI prediction methods.
Provided analysis of cross-modal interactions and model limitations.
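For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so every category counts equally regardless of how many posts it contains. The sketch below is a minimal illustration of that definition, not the authors' evaluation code. The function name `macro_f1` and the three category labels are hypothetical, since this summary does not list the paper's eight categories. The function returns a value in [0, 1], whereas the score above appears to be reported on a 0 to 100 scale.

```python
def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores (range 0 to 1)."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions, for illustration only.
labels = ["food", "shop", "park"]
y_true = ["food", "food", "shop", "park"]
y_pred = ["food", "shop", "shop", "park"]
score = macro_f1(y_true, y_pred, labels)
print(f"macro F1 = {100 * score:.2f}")
```

Because each class contributes equally, a model that does well only on the most frequent categories still gets a low macro F1, which makes the metric a common choice for imbalanced multi-class tasks.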
Abstract
Point-of-interest (POI) type prediction is the task of inferring the type of a place from where a social media post was shared. Inferring a POI's type is useful for studies in computational social science including sociolinguistics, geosemiotics, and cultural geography, and has applications in geosocial networking technologies such as recommendation and visualization systems. Prior efforts in POI type prediction focus solely on text, without taking visual information into account. However, in reality, the variety of modalities, as well as their semiotic relationships with one another, shape communication and interactions in social media. This paper presents a study on POI type prediction using multimodal information from text and images available at posting time. For that purpose, we enrich a currently available data set for POI type prediction with the images that accompany the text…
