Beyond Word Embeddings: Learning Entity and Concept Representations from Large Scale Knowledge Bases
Walid Shalaby, Wlodek Zadrozny, and Hongxia Jin

TL;DR
This paper introduces a novel method for learning entity and concept representations by integrating large-scale knowledge bases, achieving state-of-the-art results in analogy reasoning and concept categorization tasks.
Contribution
It proposes a simple, effective technique to combine Wikipedia and Probase knowledge bases into a unified embedding model using skip-gram, improving concept representation quality.
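One way to picture how a skip-gram model could consume both sources is to turn each into (target, context) training pairs. The sketch below is illustrative only and is not taken from the paper: the helper names `text_pairs` and `graph_pairs`, the underscore-joined concept tokens, the window size, and the choice to treat each graph edge as a symmetric pair are all assumptions.

```python
from typing import List, Tuple

Pair = Tuple[str, str]


def text_pairs(tokens: List[str], window: int = 2) -> List[Pair]:
    """Skip-gram (target, context) pairs from a token sequence.

    Assumes multiword concepts were already merged into single tokens
    (e.g. "Eiffel_Tower"); that preprocessing step is hypothetical here.
    """
    pairs: List[Pair] = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def graph_pairs(edges: List[Pair]) -> List[Pair]:
    """Treat each concept-graph edge as a pair in both directions.

    This symmetric treatment is an assumption made for illustration.
    """
    pairs: List[Pair] = []
    for a, b in edges:
        pairs.append((a, b))
        pairs.append((b, a))
    return pairs


# Toy inputs: one sentence with merged concept mentions, two graph edges.
tokens = ["the", "Eiffel_Tower", "is", "in", "Paris"]
edges = [("Paris", "city"), ("Eiffel_Tower", "landmark")]

# Both sources feed one shared pool of training pairs.
training_pairs = text_pairs(tokens, window=1) + graph_pairs(edges)
```

Pooling the pairs means a single vocabulary and a single embedding space cover words, entities, and concepts, which is the property a unified model needs.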
Findings
Achieved 91% accuracy on semantic analogies
Attained 100% and 98% accuracy on two concept categorization datasets
Demonstrated effective unsupervised argument type identification
Abstract
Text representations using neural word embeddings have proven effective in many NLP applications. Recent research adapts traditional word embedding models to learn vectors of multiword expressions (concepts/entities). However, these methods are limited to textual knowledge bases (e.g., Wikipedia). In this paper, we propose a novel and simple technique for integrating knowledge about concepts from two large-scale knowledge bases with different structures (Wikipedia and Probase) in order to learn concept representations. We adapt the efficient skip-gram model to seamlessly learn from the knowledge in Wikipedia text and the Probase concept graph. We evaluate our concept embedding models on two tasks: (1) analogical reasoning, where we achieve state-of-the-art performance of 91% on semantic analogies, and (2) concept categorization, where we achieve state-of-the-art performance on two…
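For readers unfamiliar with the analogical reasoning task, the sketch below shows the common vector-offset formulation: given "a is to b as c is to ?", pick the vocabulary item closest to b - a + c by cosine similarity. The abstract does not say which scoring rule the authors use, so this is an assumption, and the three-dimensional vectors and the `solve_analogy` helper are hand-made for illustration rather than learned embeddings.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def solve_analogy(emb: Dict[str, List[float]], a: str, b: str, c: str) -> str:
    """Answer 'a is to b as c is to ?' with the vector-offset rule."""
    query = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    # The three question terms are excluded from the candidate answers.
    candidates = [k for k in emb if k not in (a, b, c)]
    return max(candidates, key=lambda k: cosine(query, emb[k]))


# Hand-made toy vectors (not real embeddings).
toy = {
    "Paris": [1.0, 0.0, 1.0],
    "France": [1.0, 1.0, 0.0],
    "Berlin": [0.0, 0.0, 1.0],
    "Germany": [0.0, 1.0, 0.0],
    "Tokyo": [0.5, 0.0, 1.0],
}

answer = solve_analogy(toy, "Paris", "France", "Berlin")
```

Accuracy on an analogy benchmark is then the fraction of questions for which the returned item matches the expected answer.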
