Learning from Noisy Labels for Entity-Centric Information Extraction
Wenxuan Zhou, Muhao Chen

TL;DR
This paper introduces a co-regularization framework for entity-centric information extraction that mitigates overfitting to noisy labels by using multiple neural models with agreement regularization, improving performance on noisy benchmarks.
Contribution
It proposes a simple yet effective multi-model co-regularization approach that reduces overfitting to noisy labels in information extraction tasks.
Findings
The framework improves extraction performance on the TACRED and CoNLL03 datasets.
Regularizing the models toward agreeing predictions limits overfitting to noisy labels.
The framework is simple to implement and applies to both benchmarks.
Abstract
Recent information extraction approaches have relied on training deep neural models. However, such models can easily overfit noisy labels and suffer from performance degradation. While it is very costly to filter noisy labels in large learning resources, recent studies show that such labels take more training steps to be memorized and are more frequently forgotten than clean labels, therefore are identifiable in training. Motivated by such properties, we propose a simple co-regularization framework for entity-centric information extraction, which consists of several neural models with identical structures but different parameter initialization. These models are jointly optimized with the task-specific losses and are regularized to generate similar predictions based on an agreement loss, which prevents overfitting on noisy labels. Extensive experiments on two widely used but noisy…
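The joint objective described in the abstract can be illustrated with a minimal sketch for a single training example. The sketch rests on assumptions not stated above: the agreement loss is taken to be the mean KL divergence from the averaged prediction to each model's prediction, and the names `co_regularized_loss` and `gamma` (the weight on the agreement term) are hypothetical. It shows one plausible form of the idea, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Task-specific loss for one example: negative log-likelihood of the given label."""
    return -math.log(probs[label] + 1e-12)


def kl_divergence(q, p):
    """KL(q || p) between two discrete distributions, with a small epsilon for stability."""
    return sum(qi * math.log((qi + 1e-12) / (pi + 1e-12)) for qi, pi in zip(q, p))


def co_regularized_loss(logits_per_model, label, gamma=1.0):
    """Combine per-model task losses with an agreement term.

    logits_per_model: one list of class scores per model (same architecture,
        different initialization in the setting described above).
    label: the (possibly noisy) class index for this example.
    gamma: hypothetical weight on the agreement loss (an assumption).
    Returns (total loss, mean task loss, agreement loss).
    """
    probs = [softmax(logits) for logits in logits_per_model]
    num_models = len(probs)

    # Mean task-specific loss across models.
    task_loss = sum(cross_entropy(p, label) for p in probs) / num_models

    # Averaged prediction across models, class by class.
    avg = [sum(col) / num_models for col in zip(*probs)]

    # Assumed agreement loss: mean KL from the average to each model's prediction.
    agreement = sum(kl_divergence(avg, p) for p in probs) / num_models

    return task_loss + gamma * agreement, task_loss, agreement


if __name__ == "__main__":
    # Two models that disagree on an example whose label is class 0.
    total, task, agree = co_regularized_loss(
        [[2.0, 0.0, -1.0], [-1.0, 0.0, 2.0]], label=0, gamma=0.5
    )
    print(f"total={total:.4f} task={task:.4f} agreement={agree:.4f}")
```

When all models produce the same prediction the agreement term vanishes and only the task loss remains; when they diverge, the extra term pulls their predictions toward each other.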
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Topic Modeling
