Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
Takuma Kato, Kaori Abe, Hiroki Ouchi, Shumpei Miyawaki, Jun Suzuki, Kentaro Inui

TL;DR
This paper introduces a method that incorporates label component embeddings into sequence labeling models. In experiments on English and Japanese datasets, the method improves fine-grained named entity recognition, especially for rare labels.
Contribution
The paper proposes a novel approach to embed label components and integrate them into models, enhancing sequence labeling accuracy for low-frequency labels.
Findings
Improved NER performance with label component embeddings.
Largest gains on instances with low-frequency labels.
Effective across English and Japanese datasets.
Abstract
In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). However, while most sequence labeling models do not consider such label components, the shared components across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
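The label decomposition described in the abstract can be illustrated with a minimal sketch. The function names (`decompose`, `build_component_table`, `label_embedding`), the random initialization, and the choice to compose a label embedding by summing its component embeddings are all assumptions made for illustration; the abstract does not specify how the components are combined, so this should not be read as the authors' implementation.

```python
import random


def decompose(label):
    """Split an IOB label into (span, type); the 'O' label has no type."""
    if label == "O":
        return ("O", None)
    # Split at the first hyphen only, so "B-Person" -> ("B", "Person").
    span, _, entity_type = label.partition("-")
    return (span, entity_type)


def build_component_table(labels, dim, seed=0):
    """Create one vector per distinct component (hypothetical random init)."""
    rng = random.Random(seed)
    table = {}
    for label in labels:
        span, entity_type = decompose(label)
        for key in (("span", span), ("type", entity_type)):
            if key[1] is not None and key not in table:
                table[key] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table


def label_embedding(label, table):
    """Compose a label vector by summing its component vectors (assumption)."""
    span, entity_type = decompose(label)
    parts = [table[("span", span)]]
    if entity_type is not None:
        parts.append(table[("type", entity_type)])
    return [sum(values) for values in zip(*parts)]


labels = ["O", "B-Person", "I-Person", "B-Location", "I-Location"]
table = build_component_table(labels, dim=4)
b_person = label_embedding("B-Person", table)
i_person = label_embedding("I-Person", table)
```

In this sketch, `B-Person` and `I-Person` share the same `Person` vector, so any update to that vector would affect both labels. This sharing is the intuition behind the abstract's claim that shared components can help with low-frequency labels.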
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
