TL;DR
This paper adds a noise-modeling layer to a neural network so that noisy, automatically labeled data can be used alongside clean data, improving low-resource named entity recognition (NER) performance by up to 35%.
Contribution
It proposes a novel noise layer in neural networks that enables training on noisy data alongside clean data, enhancing low-resource NER performance.
Findings
Up to 35% performance improvement in low-resource NER
Effective noise handling improves classifier accuracy
Utilizes automatically annotated noisy data successfully
Abstract
Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are a quicker and cheaper alternative for obtaining labeled data. However, these labels often contain more errors, which can degrade a classifier's performance when it is trained on this data. We propose a noise layer that is added to a neural network architecture. This allows modeling the noise and training on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.
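The abstract does not spell out how the noise layer is built, so the following is only a minimal sketch of one common way to realize the idea, not the paper's implementation. It assumes the noise is modeled as a row-stochastic matrix over labels, estimated from a small set of tokens that carry both a clean and a noisy label, and applied to the base model's softmax output when training on noisy data. The function names (`estimate_noise_matrix`, `noisy_label_distribution`, `cross_entropy`) and the `smoothing` parameter are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def estimate_noise_matrix(clean_labels, noisy_labels, num_classes, smoothing=1e-6):
    """Estimate P(noisy label = j | clean label = i) from paired labels.

    Assumption: some tokens have both a clean and a noisy label.
    `smoothing` avoids all-zero rows for classes never seen in the pairs.
    """
    counts = np.full((num_classes, num_classes), smoothing)
    for clean, noisy in zip(clean_labels, noisy_labels):
        counts[clean, noisy] += 1.0
    # Normalize each row so it is a probability distribution.
    return counts / counts.sum(axis=1, keepdims=True)

def noisy_label_distribution(clean_probs, noise_matrix):
    """Noise layer: map the predicted clean distribution to a noisy one."""
    return clean_probs @ noise_matrix

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer labels under `probs`."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked))

# Toy example with two classes (0 = O, 1 = entity); all values are made up.
paired_clean = [0, 0, 1, 1]
paired_noisy = [0, 1, 1, 1]
noise_matrix = estimate_noise_matrix(paired_clean, paired_noisy, num_classes=2)

logits = np.array([[2.0, 0.0], [0.0, 1.0]])  # stand-in for base model output
clean_probs = softmax(logits)

# Clean batches are scored directly on the base model's predictions;
# noisy batches are scored after passing through the noise layer.
loss_clean = cross_entropy(clean_probs, np.array([0, 1]))
loss_noisy = cross_entropy(
    noisy_label_distribution(clean_probs, noise_matrix), np.array([1, 1])
)
```

In this sketch the noise matrix is held fixed for brevity; in a real network it would typically be a trainable layer initialized from such an estimate, and it would be dropped at prediction time so that the model outputs clean labels.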
