Classification of entities via their descriptive sentences
Chao Zhao, Min Zhao, Yi Guan

TL;DR
This paper presents a classification-based approach that uses a CNN classifier and K-means clustering to identify hypernyms of entities from their names and descriptions, reaching 99.36% precision on entities from Baidu Baike, a large-scale Chinese knowledge base.
Contribution
It recasts hypernym identification as classification into the leaf concepts of a pre-defined taxonomy, combining a CNN classifier with K-means clustering to address the low precision or low recall of existing methods.
Findings
Achieved 99.36% precision on Baidu Baike entities
Successfully classified 1.1 million entities out of 2.1 million
Demonstrated effectiveness of combined CNN and clustering approach
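To put the first two findings on one scale, the short calculation below derives the coverage and an estimate of the number of correctly classified entities. It is only a back-of-the-envelope sketch: the inputs are the rounded figures reported above ("1.1 million", "2.1 million"), so the outputs are approximate, and the variable names are illustrative.

```python
# Rounded figures taken from the findings above; exact counts are not given.
total_entities = 2_100_000   # Baidu Baike entities the system was applied to
classified = 1_100_000       # entities that received a leaf concept
precision = 0.9936           # reported precision on the classified entities

# Coverage: share of all entities that the system classified at all.
coverage = classified / total_entities

# Estimated number of classified entities whose label is correct.
estimated_correct = round(classified * precision)

print(f"coverage: {coverage:.1%}")
print(f"estimated correct labels: {estimated_correct:,}")
```

Under these rounded inputs, roughly half of the entities receive a label, which suggests the system favors precision over coverage.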
Abstract
Hypernym identification of open-domain entities is crucial for taxonomy construction as well as many higher-level applications. Current methods suffer from either low precision or low recall. To decrease the difficulty of this problem, we adopt a classification-based method. We pre-define a concept taxonomy and classify an entity into one of its leaf concepts, based on the name and description information of the entity. A convolutional neural network classifier and a K-means clustering module are adopted for classification. We applied this system to 2.1 million Baidu Baike entities, and 1.1 million of them were successfully identified with a precision of 99.36%.
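The classification framing in the abstract can be illustrated with a minimal sketch: enumerate the leaf concepts of a pre-defined taxonomy, assign an entity to one leaf from its name and description, and read its hypernyms off the path to the root. Everything concrete here is an assumption made for illustration: the toy taxonomy, the English keywords, and the function names are hypothetical, and the keyword-overlap scorer is only a stand-in for the paper's CNN classifier. The K-means module is left out because this summary does not describe how it is used.

```python
# Hypothetical toy taxonomy: inner nodes are dicts, leaves hold keyword lists.
# The paper's actual taxonomy and features are not given in this summary.
TAXONOMY = {
    "entity": {
        "person": {
            "athlete": ["player", "team", "league"],
            "scientist": ["research", "physics", "professor"],
        },
        "place": {
            "city": ["city", "population", "province"],
            "river": ["river", "flows", "basin"],
        },
    }
}


def collect_leaves(node, path=()):
    """Map each leaf concept to (path from root, keyword set)."""
    leaves = {}
    for name, child in node.items():
        here = path + (name,)
        if isinstance(child, dict):
            leaves.update(collect_leaves(child, here))
        else:
            leaves[name] = (here, set(child))
    return leaves


LEAVES = collect_leaves(TAXONOMY)


def classify(name, description):
    """Stand-in for the CNN: pick the leaf with the most keyword overlap.

    Returns None when no keyword matches, i.e. the entity is left unclassified.
    """
    tokens = set((name + " " + description).lower().split())
    scores = {leaf: len(tokens & keywords) for leaf, (_, keywords) in LEAVES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def hypernym_chain(leaf):
    """Return the concepts from the taxonomy root down to the given leaf."""
    return list(LEAVES[leaf][0])


leaf = classify("Yangtze", "The longest river in Asia, it flows east")
print(leaf, hypernym_chain(leaf))
```

Because every leaf sits at a fixed position in the taxonomy, predicting a single leaf label yields the entity's whole chain of hypernyms at once, which is what makes the classification framing simpler than open-ended hypernym extraction.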
Taxonomy
Topics: Advanced Text Analysis Techniques · Natural Language Processing Techniques · Topic Modeling
Methods: k-Means Clustering
