Towards Realistic Single-Task Continuous Learning Research for NER

Justin Payan; Yuval Merhav; He Xie; Satyapriya Krishna; Anil Ramakrishna; Mukund Sridhar; Rahul Gupta

arXiv:2110.14694 · cs.CL · October 29, 2021

TL;DR

This paper addresses the lack of realistic continuous learning (CL) benchmarks for Named Entity Recognition (NER). It constructs a CL NER dataset from an existing public dataset, analyzes the challenges of realistic single-task CL, and evaluates data rehearsal as a way to mitigate accuracy loss.

Contribution

The paper introduces a CL NER dataset derived from an existing publicly available dataset, discusses the challenges of realistic single-task CL, and evaluates data rehearsal as a strategy for mitigating accuracy loss.

Findings

01 Constructed a new CL NER dataset for realistic scenarios

02 Identified challenges in applying CL to NER tasks

03 Evaluated the effectiveness of data rehearsal in maintaining accuracy

Abstract

There is an increasing interest in continuous learning (CL), as data privacy is becoming a priority for real-world machine learning applications. Meanwhile, there is still a lack of academic NLP benchmarks that are applicable for realistic CL settings, which is a major challenge for the advancement of the field. In this paper we discuss some of the unrealistic data characteristics of public datasets, study the challenges of realistic single-task continuous learning as well as the effectiveness of data rehearsal as a way to mitigate accuracy loss. We construct a CL NER dataset from an existing publicly available dataset and release it along with the code to the research community.
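The abstract names data rehearsal as the mitigation under study. As a rough illustration of the general idea (not the authors' implementation, which lives in the released repository), the sketch below mixes each time step's new training data with a sample of stored older examples. The class `RehearsalBuffer`, the function `training_set_for_step`, the `rehearsal_ratio` parameter, and the `(tokens, labels)` example format are all hypothetical names chosen for this sketch, and reservoir sampling is just one assumed way to keep the memory bounded.

```python
import random
from typing import List, Sequence, Tuple

# A tagged sentence: (tokens, BIO labels). Hypothetical format for illustration.
Example = Tuple[List[str], List[str]]


class RehearsalBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: List[Example] = []
        self.seen = 0  # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example: Example) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
            return
        # Keep the new example with probability capacity / seen.
        slot = self.rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = example

    def sample(self, k: int) -> List[Example]:
        # Never ask for more examples than the buffer holds.
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def training_set_for_step(
    new_data: Sequence[Example],
    buffer: RehearsalBuffer,
    rehearsal_ratio: float,
) -> List[Example]:
    """Mix the current time step's data with replayed older examples."""
    n_replay = int(len(new_data) * rehearsal_ratio)
    mixed = list(new_data) + buffer.sample(n_replay)
    buffer.rng.shuffle(mixed)
    # Only after building the training set, remember the new data for later steps.
    for example in new_data:
        buffer.add(example)
    return mixed
```

A model would then be fine-tuned on the returned list at each time step. Setting `rehearsal_ratio` to 0 gives plain sequential fine-tuning, which is the baseline rehearsal is usually compared against.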

Code & Models

Repositories

justinpayan/stackoverflowner-ns (PyTorch, official)
