Multimodal Entity Linking for Tweets

Omar Adjali; Romaric Besançon; Olivier Ferret; Hervé Le Borgne; Brigitte Grau

arXiv:2104.03236·cs.IR·April 8, 2021

TL;DR

This paper introduces a new dataset and model for multimodal entity linking on Twitter, combining text and images to improve entity disambiguation in social media content.

Contribution

The paper presents a method for building a fully annotated Twitter dataset for multimodal entity linking, together with a model that jointly learns representations of mentions and entities from their textual and visual contexts.

Findings

01. The proposed model outperforms text-only approaches on the proposed dataset.

02. Visual information improves entity linking accuracy when it is available.

03. The dataset enables future research in multimodal entity disambiguation.

Abstract

In many information extraction applications, entity linking (EL) has emerged as a crucial task that allows leveraging information about named entities from a knowledge base. In this paper, we address the task of multimodal entity linking (MEL), an emerging research field in which textual and visual information is used to map an ambiguous mention to an entity in a knowledge base (KB). First, we propose a method for building a fully annotated Twitter dataset for MEL, where entities are defined in a Twitter KB. Then, we propose a model for jointly learning a representation of both mentions and entities from their textual and visual contexts. We demonstrate the effectiveness of the proposed model by evaluating it on the proposed dataset and highlight the importance of leveraging visual information when it is available.
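
To make the idea of linking a mention from its textual and visual contexts concrete, here is a minimal sketch. It is not the model from the paper, whose architecture is not described on this page. It assumes that each mention and each candidate entity already has a text embedding and an image embedding, fuses the two by normalizing and concatenating them, and ranks candidates by cosine similarity. The function names (`fuse`, `link_mention`), the toy vectors, and the candidate names are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Assumed fusion: normalize each modality, concatenate, renormalize.

    Normalizing first keeps one modality from dominating only because
    its raw embedding has a larger magnitude.
    """
    return l2_normalize(l2_normalize(text_vec) + l2_normalize(image_vec))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def link_mention(
    mention: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the candidate entity most similar to the mention, with its score."""
    scored = [(name, cosine(mention, vec)) for name, vec in candidates.items()]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy case: both candidates share the same text vector, so only the
    # image vector can separate them.
    mention = fuse(text_vec=[1.0, 0.0], image_vec=[0.0, 1.0])
    candidates = {
        "candidate_a": fuse([1.0, 0.0], [1.0, 0.0]),
        "candidate_b": fuse([1.0, 0.0], [0.0, 1.0]),
    }
    print(link_mention(mention, candidates))
```

In this toy case the text vectors of the two candidates are identical, so a text-only scorer could not separate them. The image component is what makes `candidate_b` the better match, which is the kind of ambiguity multimodal entity linking is meant to resolve.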
