Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition
Zhuojun Ding, Wei Wei, Xiaoye Qu, Dangyang Chen

TL;DR
This paper introduces GLoDe, a framework that denoises pseudo labels using global and local information to improve cross-lingual NER, incorporating language-specific features for better generalization.
Contribution
The paper proposes a novel denoising strategy and emphasizes the importance of target language-specific features in cross-lingual NER.
Findings
GLoDe outperforms state-of-the-art methods on benchmark datasets.
The denoising strategy effectively reduces label noise.
Incorporating language-specific features improves model performance.
Abstract
Cross-lingual named entity recognition (NER) aims to train an NER model for the target language leveraging only labeled source language data and unlabeled target language data. Prior approaches either perform label projection on translated source language data or employ a source model to assign pseudo labels for target language data and train a target model on these pseudo-labeled data to generalize to the target language. However, these automatic labeling procedures inevitably introduce noisy labels, thus leading to a performance drop. In this paper, we propose a Global-Local Denoising framework (GLoDe) for cross-lingual NER. Specifically, GLoDe introduces a progressive denoising strategy to rectify incorrect pseudo labels by leveraging both global and local distribution information in the semantic space. The refined pseudo-labeled target language data significantly improves the…
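The abstract describes the denoising step only at a high level, so the following is a minimal sketch under stated assumptions, not the authors' method. It assumes "global" information means similarity to per-class prototypes (the mean embedding of each pseudo-class) and "local" information means a label vote among the k nearest neighbors in embedding space. The function name `denoise_pseudo_labels` and the parameters `k` and `alpha` are hypothetical.

```python
import numpy as np

def denoise_pseudo_labels(emb, labels, num_classes, k=3, alpha=0.5):
    """Hypothetical refinement of noisy pseudo labels.

    Assumption (not from the paper): "global" = softmax over cosine
    similarity to class prototypes, "local" = label distribution of the
    k nearest neighbors. alpha mixes the two.
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Unit-normalize so dot products are cosine similarities.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    one_hot = np.eye(num_classes)[labels]                  # (N, C)

    # Global view: one prototype per class from current pseudo labels.
    counts = one_hot.sum(axis=0)                           # (C,)
    protos = (one_hot.T @ emb) / np.maximum(counts, 1)[:, None]
    scores = emb @ protos.T                                # (N, C)
    scores[:, counts == 0] = -np.inf                       # ignore empty classes
    scores = scores - scores.max(axis=1, keepdims=True)
    global_prob = np.exp(scores)
    global_prob /= global_prob.sum(axis=1, keepdims=True)

    # Local view: label vote among the k nearest neighbors (self excluded).
    sim = emb @ emb.T
    np.fill_diagonal(sim, -np.inf)
    nn = np.argsort(-sim, axis=1)[:, :k]                   # (N, k)
    local_prob = one_hot[nn].mean(axis=1)                  # (N, C)

    combined = alpha * global_prob + (1 - alpha) * local_prob
    return combined.argmax(axis=1)
```

In a progressive scheme, a function like this could be re-applied after each round of target-model training, so that prototypes and neighborhoods are recomputed from updated embeddings. Whether GLoDe does exactly this is not stated in the text above.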
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
