CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data
Suman Dowlagar, Radhika Mamidi

TL;DR
This paper presents team CMNEROne's approach to code-mixed Named Entity Recognition for SemEval-2022 Task 11 (MultiCoNER). By leveraging multilingual data, it reaches a weighted average F1 score of 0.7044, 6% above the baseline.
Contribution
The work leverages multilingual data for code-mixed NER, improving performance over the baseline model.
Findings
Achieved a weighted average F1 score of 0.7044, 6% above the baseline.
Demonstrated the effectiveness of multilingual data in code-mixed NER.
Reported these results on the code-mixed dataset of the MultiCoNER shared task.
Abstract
Identifying named entities is a practical and challenging task in Natural Language Processing. Named Entity Recognition on code-mixed text is more challenging still, owing to the linguistic complexity that arises from mixing languages. This paper describes the submission of team CMNEROne to SemEval-2022 Task 11, MultiCoNER. The code-mixed NER task aimed to identify named entities in a code-mixed dataset. Our work performs Named Entity Recognition (NER) on the code-mixed dataset by leveraging multilingual data. We achieved a weighted average F1 score of 0.7044, i.e., 6% greater than the baseline.
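For readers unfamiliar with the metric, the following is a minimal sketch of how a weighted average F1 score can be computed. It assumes "weighted average" means per-label F1 weighted by each label's share of the gold annotations, which is the common convention; the paper's actual evaluation script is not shown here and may differ (for example, by scoring whole entity spans rather than individual labels). The function name `weighted_f1` and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Support-weighted average of per-label F1 (illustrative sketch only)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")

    support = Counter(gold)  # number of gold items per label
    total = len(gold)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each label contributes its F1 in proportion to its gold support.
        score += f1 * count / total
    return score


# Toy data, not taken from the paper.
gold = ["PER", "PER", "LOC", "O"]
pred = ["PER", "LOC", "LOC", "O"]
print(weighted_f1(gold, pred))
```

Because each label's F1 is weighted by its frequency in the gold data, frequent entity types dominate the score, whereas a macro average would weight every type equally.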
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
