MM-Locate-News: Multimodal Focus Location Estimation in News
Golsa Tahmasebzadeh, Eric Müller-Budack, Sherzod Hakimov, Ralph Ewerth

TL;DR
This paper introduces MM-Locate-News, a new multimodal dataset and models for estimating the geographic focus of news articles by combining text and images, improving accuracy over single-modality approaches.
Contribution
The paper presents a novel multimodal dataset and models for focus location estimation, addressing the challenge of integrating text and images in news analysis.
Findings
Multimodal models outperform unimodal models in focus location estimation.
The new dataset enables benchmarking of multimodal geolocation methods.
Experimental results demonstrate improved accuracy with combined modalities.
Abstract
The consumption of news has changed significantly as the Web has become the most influential medium for information. To analyze and contextualize the large amount of news published every day, the geographic focus of an article is an important aspect in order to enable content-based news retrieval. There are methods and datasets for geolocation estimation from text or photos, but they are typically considered as separate tasks. However, the photo might lack geographical cues and text can include multiple locations, making it challenging to recognize the focus location using a single modality. In this paper, a novel dataset called Multimodal Focus Location of News (MM-Locate-News) is introduced. We evaluate state-of-the-art methods on the new benchmark dataset and suggest novel models to predict the focus location of news using both textual and image content. The experimental results show…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Video Analysis and Summarization
