TL;DR
This paper introduces a method to generate datasets from knowledge graphs to evaluate how well pre-trained word and concept embeddings capture various semantic relations, revealing strengths and gaps in current models.
Contribution
It proposes a novel approach for dataset generation from knowledge graphs and provides an analysis of the relational knowledge captured by embeddings.
Findings
Embeddings capture some semantic relations effectively.
Evaluation reveals gaps in relational knowledge.
The method is evaluated on both a proprietary and a public knowledge graph.
Abstract
Deep learning currently dominates the benchmarks for various NLP tasks. At the basis of such systems, words are frequently represented as embeddings (vectors in a low-dimensional space) learned from large text corpora, and various algorithms have been proposed to learn both word and concept embeddings. One of the claimed benefits of such embeddings is that they capture knowledge about semantic relations. Such embeddings are most often evaluated through tasks such as predicting human-rated similarity and analogy, which only test a few, often ill-defined, relations. In this paper, we propose a method for (i) reliably generating word and concept pair datasets for a wide range of relations by using a knowledge graph and (ii) evaluating to what extent pre-trained embeddings capture those relations. We evaluate the approach against a proprietary and a public knowledge graph and analyze…
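The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how pairs are sampled or how embeddings are scored, so the negative sampling by object corruption, the concatenated pair features, and every name and value below (TRIPLES, EMBEDDINGS, build_dataset, featurize) are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical toy knowledge graph as (subject, relation, object) triples.
TRIPLES = [
    ("dog", "hypernym", "animal"),
    ("cat", "hypernym", "animal"),
    ("oak", "hypernym", "tree"),
    ("wheel", "part_of", "car"),
    ("petal", "part_of", "flower"),
]

# Hypothetical 2-d "pre-trained" vectors; real embeddings would be loaded from disk.
EMBEDDINGS = {
    "dog": [0.1, 0.9], "cat": [0.2, 0.8], "oak": [0.9, 0.1],
    "animal": [0.15, 0.85], "tree": [0.8, 0.2], "car": [0.5, 0.5],
}


def pairs_by_relation(triples):
    """Group (subject, object) pairs under their relation label."""
    grouped = defaultdict(list)
    for subj, rel, obj in triples:
        grouped[rel].append((subj, obj))
    return dict(grouped)


def build_dataset(triples, relation, seed=0):
    """Return labeled pairs: 1 for pairs in the graph, 0 for corrupted pairs.

    Assumption: negatives are made by swapping the object for a random
    vocabulary item that does not form a known positive pair.
    """
    rng = random.Random(seed)
    positives = pairs_by_relation(triples).get(relation, [])
    positive_set = set(positives)
    vocabulary = sorted({w for s, _, o in triples for w in (s, o)})
    dataset = [(s, o, 1) for s, o in positives]
    for subj, _ in positives:
        candidates = [w for w in vocabulary
                      if w != subj and (subj, w) not in positive_set]
        if candidates:
            dataset.append((subj, rng.choice(candidates), 0))
    return dataset


def featurize(dataset, embeddings):
    """Concatenate the two vectors of each pair; skip pairs missing a vector."""
    rows = []
    for subj, obj, label in dataset:
        if subj in embeddings and obj in embeddings:
            rows.append((embeddings[subj] + embeddings[obj], label))
    return rows


dataset = build_dataset(TRIPLES, "hypernym")
features = featurize(dataset, EMBEDDINGS)
```

The rows returned by featurize could then be passed to any binary classifier, with held-out accuracy per relation serving as one possible indicator of how well the embeddings encode that relation.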
