A Noise-Robust Loss for Unlabeled Entity Problem in Named Entity Recognition
Wentao Kang, Guijun Zhang, Xiao Fu

TL;DR
This paper introduces NRCES, a noise-robust loss function for distantly supervised NER that effectively handles unlabeled entity noise, improving performance on noisy datasets.
Contribution
The paper proposes NRCES, a novel loss function with a sigmoid component that reduces sensitivity to unlabeled noise in distantly supervised NER.
Findings
NRCES outperforms traditional loss functions on synthetic datasets.
The approach achieves state-of-the-art results on real-world noisy datasets.
NRCES demonstrates strong robustness against severe unlabeled entity problems.
Abstract
Named Entity Recognition (NER) is an important task in natural language processing. However, traditional supervised NER requires large-scale annotated datasets. Distant supervision has been proposed to alleviate this heavy demand for annotated data, but datasets constructed in this way are extremely noisy and suffer from a serious unlabeled entity problem. The cross entropy (CE) loss function is highly sensitive to unlabeled entities, leading to severe performance degradation. As an alternative, we propose a new loss function called NRCES to cope with this problem. A sigmoid term is used to mitigate the negative impact of noise. In addition, we balance the convergence and noise tolerance of the model according to the samples and the training process. Experiments on synthetic and real-world datasets demonstrate that our approach shows strong robustness in the presence of a severe unlabeled entity problem, achieving…
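The abstract does not state the NRCES formula, so the sketch below does not reproduce it. It only illustrates the motivation: cross entropy grows without bound as the model becomes confident against a noisy label, while a loss passed through a sigmoid stays bounded. The function `bounded_sigmoid_loss` and its slope `k` are hypothetical choices made for this illustration and are not taken from the paper.

```python
import math


def cross_entropy(p_label: float) -> float:
    """Cross entropy for one token: -log of the probability given to its label."""
    return -math.log(p_label)


def bounded_sigmoid_loss(p_label: float, k: float = 10.0) -> float:
    """Hypothetical bounded loss (NOT NRCES): a sigmoid of the scaled margin.

    The value always lies in (0, 1), so a single mislabeled token
    cannot dominate the total loss.
    """
    return 1.0 / (1.0 + math.exp(k * (p_label - 0.5)))


# A true entity left unlabeled is tagged "O". A model that correctly
# recognizes the entity assigns a low probability to the noisy "O" label.
for p_noisy_label in (0.5, 0.1, 0.001):
    ce = cross_entropy(p_noisy_label)
    bounded = bounded_sigmoid_loss(p_noisy_label)
    print(f"p(O)={p_noisy_label:<6} CE={ce:.3f} bounded={bounded:.3f}")
```

As the probability of the noisy label approaches zero, the CE penalty keeps increasing, whereas the bounded variant saturates below 1. This saturation is the general property that a sigmoid term can provide; how NRCES balances it against convergence is described only in the full paper.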
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
