Multimodal Named Entity Recognition for Short Social Media Posts

Seungwhan Moon; Leonardo Neves; Vitor Carvalho

arXiv:1802.07862·cs.CL·February 23, 2018

TL;DR

This paper introduces Multimodal Named Entity Recognition (MNER), a new task for social media posts that pair short text with images. It contributes a dataset, SnapCaptions, and a model that uses visual context to improve entity recognition in short, noisy posts.

Contribution

The paper presents the first dataset for multimodal NER on social media and a novel model with modality attention that effectively integrates visual context to enhance NER performance.

Findings

01. The multimodal model significantly outperforms text-only NER models.

02. Visual context improves entity recognition accuracy in noisy social media posts.

03. The modality-attention mechanism amplifies the most informative modalities and attenuates irrelevant ones.

Abstract

We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. These social media posts often come in inconsistent or incomplete syntax and lexical notations with very limited surrounding textual contexts, bringing significant challenges for NER. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories with fully annotated named entities). We then build upon the state-of-the-art Bi-LSTM word/character based NER models with 1) a deep image network which incorporates relevant visual context to augment textual information, and 2) a generic modality-attention module which learns to attenuate irrelevant modalities while amplifying the most informative ones to extract contexts…
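The modality-attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each token already has a word, character, and image embedding of the same dimension, and it scores each modality with a hypothetical per-modality weight vector and bias (`weights`, `biases`) before normalizing the scores with a softmax. The function name `modality_attention` and all parameter values are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention(embeddings, weights, biases):
    """Fuse per-modality embeddings into one vector (illustrative sketch).

    embeddings: dict of modality name -> vector (all the same length)
    weights:    dict of modality name -> scoring vector (hypothetical parameters)
    biases:     dict of modality name -> float (hypothetical parameters)
    Returns (attention weights per modality, fused vector).
    """
    names = list(embeddings)
    # One scalar relevance score per modality: dot(w_m, x_m) + b_m.
    scores = [
        sum(w * x for w, x in zip(weights[n], embeddings[n])) + biases[n]
        for n in names
    ]
    # Softmax turns scores into weights that sum to 1, so a high-scoring
    # modality is amplified and low-scoring ones are attenuated.
    alphas = softmax(scores)
    dim = len(embeddings[names[0]])
    # The fused representation is the attention-weighted sum of the modalities.
    fused = [
        sum(a * embeddings[n][i] for a, n in zip(alphas, names))
        for i in range(dim)
    ]
    return dict(zip(names, alphas)), fused


if __name__ == "__main__":
    # Toy 2-dimensional embeddings for a single token (made-up values).
    token = {"word": [1.0, 0.0], "char": [0.0, 1.0], "image": [1.0, 1.0]}
    w = {"word": [0.5, 0.5], "char": [0.5, 0.5], "image": [0.5, 0.5]}
    b = {"word": 0.0, "char": 0.0, "image": 0.0}
    attn, fused = modality_attention(token, w, b)
    print(attn)
    print(fused)
```

In a full tagger, the fused vector would be fed to the Bi-LSTM sequence model in place of a plain concatenation of the embeddings, and the scoring parameters would be learned jointly with the rest of the network.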
