2M-NER: Contrastive Learning for Multilingual and Multimodal NER with Language and Modal Fusion
Dongsheng Wang, Xiaoqin Feng, Zeming Liu, Chuan Wang

TL;DR
This paper introduces 2M-NER, a novel contrastive learning model for multilingual and multimodal named entity recognition, leveraging a new dataset with four languages and two modalities to improve entity recognition accuracy.
Contribution
The paper constructs a large-scale multilingual and multimodal NER dataset and proposes a contrastive learning-based model that effectively fuses language and modality features for improved NER performance.
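To make the idea of contrastive text-image alignment concrete, here is a minimal sketch of a symmetric InfoNCE-style loss over paired sentence and image embeddings. It is a generic illustration, not the objective used in 2M-NER: the function names, the temperature value, and the toy embeddings are all assumptions, and the summary above does not specify the paper's actual loss.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_entropy_row(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def info_nce(text_embs, image_embs, temperature=0.07):
    """Symmetric contrastive loss: the k-th text and k-th image are the
    positive pair; every other pairing in the batch is a negative.
    The temperature of 0.07 is an assumed, commonly used default."""
    text = [l2_normalize(v) for v in text_embs]
    image = [l2_normalize(v) for v in image_embs]
    n = len(text)
    # Cosine similarities scaled by the temperature.
    sims = [[dot(t, i) / temperature for i in image] for t in text]
    text_to_image = sum(cross_entropy_row(sims[k], k) for k in range(n)) / n
    image_to_text = sum(
        cross_entropy_row([sims[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (text_to_image + image_to_text)

# Toy 2-D embeddings (hypothetical values, for illustration only).
texts = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(texts, [[1.0, 0.1], [0.1, 1.0]])
shuffled = info_nce(texts, [[0.1, 1.0], [1.0, 0.1]])
print(aligned < shuffled)
```

The loss is low when each sentence embedding is closest to its own image and high when the pairs are mismatched, which is the signal a contrastive objective uses to pull the two modalities into a shared space.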
Findings
2M-NER achieves the highest F1 score among tested models.
Multimodal collaboration enhances NER accuracy.
Sentence-level alignment can interfere with NER models.
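For readers unfamiliar with how an F1 score is typically computed for NER, the sketch below scores predictions at the entity level from BIO tags: a predicted entity counts as correct only if its span and type both match the gold annotation. This is a common convention, assumed here for illustration; the summary does not state the paper's exact evaluation protocol, and the label names (PER, LOC, ORG) are hypothetical.

```python
def extract_spans(tags):
    """Collect (start, end, label) entity spans from a BIO tag sequence.
    An I- tag that does not continue the current entity is ignored."""
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        if tag.startswith("I-") and label == tag[2:]:
            continue  # still inside the current entity
        if label is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: harmonic mean of span-level precision and recall."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# One of two entities is predicted with the wrong type.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
score = entity_f1(gold, pred)
print(score)
```

In this toy example one of the two predicted entities matches the gold annotation, so precision and recall are both one half.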
Abstract
Named entity recognition (NER) is a fundamental task in natural language processing that involves identifying and classifying entities in sentences into pre-defined types. It plays a crucial role in various research fields, including entity linking, question answering, and online product recommendation. Recent studies have shown that incorporating multilingual and multimodal datasets can enhance the effectiveness of NER. This is due to language transfer learning and the presence of shared implicit features across different modalities. However, the lack of a dataset that combines multilingualism and multimodality has hindered research exploring the combination of these two aspects, as multimodality can help NER in multiple languages simultaneously. In this paper, we aim to address a more challenging task: multilingual and multimodal named entity recognition (MMNER), considering its…
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems
Methods: Contrastive Learning
