Learning from Noisy Labels for Entity-Centric Information Extraction

Wenxuan Zhou; Muhao Chen

arXiv:2104.08656·cs.CL·January 24, 2022

TL;DR

This paper introduces a co-regularization framework for entity-centric information extraction that mitigates overfitting to noisy labels by using multiple neural models with agreement regularization, improving performance on noisy benchmarks.

Contribution

It proposes a simple yet effective multi-model co-regularization approach that reduces overfitting to noisy labels in information extraction tasks.

Findings

01. Improved extraction accuracy on the TACRED and CoNLL03 datasets.

02. Models effectively identify and mitigate the impact of noisy labels.

03. Framework is simple to implement and generalizes well across benchmarks.

Abstract

Recent information extraction approaches have relied on training deep neural models. However, such models can easily overfit to noisy labels and suffer from performance degradation. While it is very costly to filter noisy labels in large learning resources, recent studies show that such labels take more training steps to be memorized and are more frequently forgotten than clean labels, and are therefore identifiable during training. Motivated by these properties, we propose a simple co-regularization framework for entity-centric information extraction, which consists of several neural models with identical structures but different parameter initializations. These models are jointly optimized with the task-specific losses and are regularized to generate similar predictions through an agreement loss, which prevents overfitting to noisy labels. Extensive experiments on two widely used but noisy…
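
The joint objective described in the abstract (task-specific losses plus an agreement loss across several models) can be illustrated with a small sketch. The abstract does not give the exact form of the agreement loss, so this example assumes one plausible choice: the mean KL divergence from the averaged prediction to each model's prediction, weighted by a hypothetical coefficient `gamma`. The function names and the per-example, pure-Python formulation are illustrative only and are not taken from the official repository.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Task loss for one example: negative log-probability of the given label."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def co_regularized_loss(model_logits, label, gamma=1.0):
    """Combine the average task loss with an agreement term (assumed form).

    model_logits: one list of class scores per model, all for the same example.
    label: the (possibly noisy) annotated class index.
    gamma: hypothetical weight on the agreement term.
    Returns (total, task, agreement).
    """
    probs = [softmax(logits) for logits in model_logits]
    num_models = len(probs)
    num_classes = len(probs[0])

    # Average task-specific loss over all models.
    task = sum(cross_entropy(p, label) for p in probs) / num_models

    # Averaged prediction across models serves as the soft consensus target.
    avg = [sum(p[c] for p in probs) / num_models for c in range(num_classes)]

    # Agreement term: how far each model is from the consensus, on average.
    agreement = sum(kl_divergence(avg, p) for p in probs) / num_models

    return task + gamma * agreement, task, agreement


if __name__ == "__main__":
    # Two models that disagree on an example labeled as class 0.
    total, task, agreement = co_regularized_loss(
        [[2.0, 0.5, -1.0], [-1.0, 0.5, 2.0]], label=0, gamma=1.0
    )
    print(f"task={task:.4f} agreement={agreement:.4f} total={total:.4f}")
```

Under this assumed form, the agreement term is zero when all models produce the same distribution and grows as their predictions diverge, so an example that only some models have memorized contributes an extra penalty. In practice the same computation would be done on batched tensors inside the training loop.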

Code & Models

Repositories

wzhouad/NLL-IE (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Topic Modeling