Low-Resource Named Entity Recognition with Cross-Lingual, Character-Level Neural Conditional Random Fields

Ryan Cotterell; Kevin Duh

arXiv:2404.09383·cs.CL·April 16, 2024·65 cites

Open Access

TL;DR

This paper introduces a transfer learning approach using character-level neural CRFs that jointly trains high-resource and low-resource languages, significantly improving NER performance in low-resource settings.

Contribution

It proposes a novel joint training scheme for character-level neural CRFs across related languages, enhancing low-resource NER without extensive annotation.

Findings

01

F1 improves by up to 9.8 points over a log-linear CRF baseline

02

Joint training enables transfer learning across languages

03

Effective for low-resource language NER tasks

Abstract

Low-resource named entity recognition is still an open problem in NLP. Most state-of-the-art systems require tens of thousands of annotated sentences to obtain high performance. However, for most of the world's languages, it is infeasible to obtain such annotation. In this paper, we present a transfer learning scheme, whereby we train character-level neural CRFs to predict named entities for both high-resource languages and low-resource languages jointly. Learning character representations for multiple related languages allows transfer among the languages, improving F1 by up to 9.8 points over a log-linear CRF baseline.
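To make the joint training idea in the abstract concrete, here is a minimal sketch of a linear-chain CRF log-likelihood and a joint loss summed over sentences from several languages. It is an illustration under stated assumptions, not the authors' implementation: the function names (`crf_log_likelihood`, `joint_nll`) are hypothetical, the emission scores are supplied as plain numbers rather than computed by a character-level network, and the only shared parameter shown is the transition matrix.

```python
import math
from itertools import product


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def crf_log_likelihood(emissions, transitions, tags):
    """Return log p(tags | sentence) under a linear-chain CRF.

    emissions[t][y]   : score of tag y at position t. Assumption: in a neural
                        CRF these would come from a network over characters;
                        here they are given directly.
    transitions[a][b] : score of moving from tag a to tag b.
    tags              : gold tag index for each position.
    """
    n_tags = len(transitions)

    # Unnormalized score of the gold tag sequence.
    gold = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        gold += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]

    # Forward algorithm: alpha[b] is the log-sum of scores of all prefixes ending in tag b.
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[a] + transitions[a][b] for a in range(n_tags)])
            + emissions[t][b]
            for b in range(n_tags)
        ]

    log_partition = logsumexp(alpha)
    return gold - log_partition


def joint_nll(data_by_language, transitions):
    """Negative log-likelihood summed over all languages.

    data_by_language maps a language name to a list of (emissions, tags) pairs.
    Every language is scored with the same `transitions`, which stands in for
    whatever parameters are shared across languages.
    """
    return -sum(
        crf_log_likelihood(emissions, transitions, tags)
        for sentences in data_by_language.values()
        for emissions, tags in sentences
    )


# Toy usage: two tags, one three-token sentence per language.
transitions = [[0.2, -0.1], [0.0, 0.4]]
emissions = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.9]]
data = {
    "high_resource": [(emissions, [0, 1, 1])],
    "low_resource": [(emissions, [1, 1, 0])],
}
loss = joint_nll(data, transitions)
```

Because both languages contribute to one loss, a gradient step on the shared parameters is informed by the high-resource data even when the low-resource language has few annotated sentences, which is the transfer effect the abstract describes.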

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Conditional Random Field