A Chinese Corpus for Fine-grained Entity Typing
Chin Lee, Hongliang Dai, Yangqiu Song, Xin Li

TL;DR
This paper introduces a Chinese corpus for fine-grained entity typing: 4,800 mentions manually labeled with free-form entity types through crowdsourcing, with all fine-grained types further grouped into 10 general types.
Contribution
The paper provides the first Chinese fine-grained entity typing dataset with manual annotations and explores neural models and cross-lingual transfer learning for improved performance.
Findings
Neural models achieve promising results on the dataset.
Cross-lingual transfer learning enhances Chinese entity typing.
The dataset facilitates future research in Chinese NLP.
Abstract
Fine-grained entity typing is a challenging task with wide applications. However, most existing datasets for this task are in English. In this paper, we introduce a corpus for Chinese fine-grained entity typing that contains 4,800 mentions manually labeled through crowdsourcing. Each mention is annotated with free-form entity types. To make our dataset useful in more possible scenarios, we also categorize all the fine-grained types into 10 general types. Finally, we conduct experiments with some neural models whose structures are typical in fine-grained entity typing and show how well they perform on our dataset. We also show the possibility of improving Chinese fine-grained entity typing through cross-lingual transfer learning.
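To make the annotation scheme described in the abstract concrete, here is a minimal Python sketch of how one labeled mention and the fine-to-general type grouping might be represented. Everything in it is an assumption for illustration: the `Mention` class, the `FINE_TO_GENERAL` table, the `general_types` helper, the type names, and the example sentence are hypothetical and are not taken from the corpus or its release format.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping from free-form fine-grained types to general types.
# These labels are illustrative placeholders, not the corpus's actual types.
FINE_TO_GENERAL: Dict[str, str] = {
    "歌手": "person",         # singer
    "演员": "person",         # actor
    "城市": "location",       # city
    "大学": "organization",   # university
}


@dataclass
class Mention:
    """One annotated entity mention inside a sentence (assumed layout)."""
    sentence: str
    start: int  # character offset where the mention begins
    end: int    # character offset one past the mention's last character
    fine_types: List[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Slice the mention's surface form out of the sentence.
        return self.sentence[self.start:self.end]


def general_types(mention: Mention,
                  mapping: Dict[str, str] = FINE_TO_GENERAL) -> List[str]:
    """Return the sorted, de-duplicated general types for a mention.

    Fine-grained types missing from the mapping are skipped.
    """
    return sorted({mapping[t] for t in mention.fine_types if t in mapping})


# Made-up example: a mention carrying two free-form fine-grained types.
example = Mention(
    sentence="周杰伦出生于台北。",
    start=0,
    end=3,
    fine_types=["歌手", "演员"],
)
```

Calling `general_types(example)` collapses the two free-form labels into a single general type. This shows how the coarser categorization could serve applications that do not need the full free-form label set.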
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
