2M-NER: Contrastive Learning for Multilingual and Multimodal NER with Language and Modal Fusion

Dongsheng Wang; Xiaoqin Feng; Zeming Liu; Chuan Wang

arXiv:2404.17122 · cs.CL · April 29, 2024

Open Access

TL;DR

This paper introduces 2M-NER, a novel contrastive learning model for multilingual and multimodal named entity recognition, leveraging a new dataset with four languages and two modalities to improve entity recognition accuracy.

Contribution

The paper constructs a large-scale multilingual and multimodal NER dataset and proposes a contrastive learning-based model that effectively fuses language and modality features for improved NER performance.
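As a rough illustration of what a contrastive objective between two modalities looks like, the sketch below computes a generic InfoNCE-style loss over paired text and image embeddings. It is not the loss used by 2M-NER, which this summary does not specify. The function names (`info_nce`, `l2_normalize`, `dot`), the temperature value, and the toy vectors are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(text_embs, image_embs, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's exact objective).

    text_embs[i] and image_embs[i] are assumed to come from the same sample
    (the positive pair); every other image in the batch serves as a negative.
    """
    text = [l2_normalize(v) for v in text_embs]
    image = [l2_normalize(v) for v in image_embs]
    losses = []
    for i, t in enumerate(text):
        logits = [dot(t, img) / temperature for img in image]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the matching image.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: two text embeddings and their hypothetical paired image embeddings.
text_batch = [[1.0, 0.0], [0.0, 1.0]]
image_batch = [[0.9, 0.1], [0.1, 0.9]]
loss = info_nce(text_batch, image_batch)
```

The loss shrinks as each text embedding moves closer to its paired image embedding and further from the other images in the batch, which is the general mechanism by which contrastive training aligns representations from different modalities or languages.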

Findings

01 2M-NER achieves the highest F1 score among tested models.

02 Multimodal collaboration enhances NER accuracy.

03 Sentence-level alignment can interfere with NER models.

Abstract

Named entity recognition (NER) is a fundamental task in natural language processing that involves identifying and classifying entities in sentences into pre-defined types. It plays a crucial role in various research fields, including entity linking, question answering, and online product recommendation. Recent studies have shown that incorporating multilingual and multimodal datasets can enhance the effectiveness of NER. This is due to language transfer learning and the presence of shared implicit features across different modalities. However, the lack of a dataset that combines multilingualism and multimodality has hindered research exploring the combination of these two aspects, as multimodality can help NER in multiple languages simultaneously. In this paper, we aim to address a more challenging task: multilingual and multimodal named entity recognition (MMNER), considering its…

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems

Methods: Contrastive Learning