Empirical Evaluation of Character-Based Model on Neural Named-Entity Recognition in Indonesian Conversational Texts

Kemal Kurniawan; Samuel Louvan

arXiv:1805.12291·cs.CL·September 20, 2018

TL;DR

This paper empirically evaluates character-based neural models for named-entity recognition (NER) on Indonesian conversational texts and finds that they outperform word-embedding-only models, particularly on out-of-vocabulary (OOV) words and when the OOV rate is high.

Contribution

It provides the first empirical assessment of character-based models for conversational NER in Indonesian, highlighting their effectiveness in OOV scenarios.

Findings

01. Character models outperform word embedding-only models by up to 4 F1 points.

02. Character models improve performance on OOV cases by up to 15 F1 points.

03. Character models are robust against high OOV rates.

Abstract

Despite the long history of the named-entity recognition (NER) task in the natural language processing community, previous work rarely studied the task on conversational texts. Such texts are challenging because they contain many word variations, which increase the number of out-of-vocabulary (OOV) words. The high number of OOV words poses a difficulty for word-based neural models. Meanwhile, there is plenty of evidence for the effectiveness of character-based neural models in mitigating this OOV problem. We report an empirical evaluation of neural sequence labeling models with character embedding to tackle the NER task in Indonesian conversational texts. Our experiments show that (1) character models outperform word embedding-only models by up to 4 $F_{1}$ points, (2) character models perform better in OOV cases with an improvement of as high as 15 $F_{1}$ points, and (3) character models are…
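To make the OOV problem described in the abstract concrete, the following is a minimal sketch of how a token-level OOV rate might be computed, alongside the share of characters already seen in training. It is not taken from the paper or its repository: the function names (`build_vocab`, `oov_rate`, `char_coverage`) and the toy sentences are hypothetical, and lowercasing plus whitespace-tokenized input are assumptions made only for illustration.

```python
from collections import Counter


def build_vocab(train_sentences, min_count=1):
    """Collect lowercased training tokens seen at least `min_count` times."""
    counts = Counter(tok.lower() for sent in train_sentences for tok in sent)
    return {tok for tok, c in counts.items() if c >= min_count}


def oov_rate(sentences, vocab):
    """Fraction of tokens in `sentences` that are absent from `vocab`."""
    tokens = [tok.lower() for sent in sentences for tok in sent]
    if not tokens:
        return 0.0
    unseen = sum(1 for tok in tokens if tok not in vocab)
    return unseen / len(tokens)


def char_coverage(sentences, train_sentences):
    """Fraction of characters in `sentences` that occur in the training data."""
    seen = {ch for sent in train_sentences for tok in sent for ch in tok.lower()}
    chars = [ch for sent in sentences for tok in sent for ch in tok.lower()]
    if not chars:
        return 1.0
    return sum(1 for ch in chars if ch in seen) / len(chars)


# Toy data (invented for illustration): standard spellings in training,
# informal conversational variants at test time.
train = [["saya", "mau", "ke", "jakarta"], ["terima", "kasih"]]
test = [["sy", "mau", "ke", "jkt"], ["makasih"]]

vocab = build_vocab(train)
rate = oov_rate(test, vocab)
coverage = char_coverage(test, train)
print(f"OOV rate: {rate:.2f}, character coverage: {coverage:.2f}")
```

In this toy example most test tokens are unseen as words, yet every character in them appears in the training data. That gap gives one intuition for why a model with character-level input could still form a useful representation for an OOV word.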

Code & Models

Repositories

akurniawan/pytorch-sequence-tagger
pytorch
