TL;DR
Eigenthemes is a scalable, unsupervised entity linking method that leverages low-rank subspace representations of entities to identify relevant entities without requiring annotated training data.
Contribution
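The paper introduces Eigenthemes, an unsupervised entity linking approach that represents candidate entities as embedding vectors, uses the singular value decomposition to identify the low-rank subspace in which the gold entities tend to lie, and scores each candidate by its proximity to that subspace. It outperforms some existing methods.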
Findings
Effective on multiple benchmark datasets
Outperforms some state-of-the-art methods
Scalable and requires no annotated training data
Abstract
Entity linking is an important problem with many applications. Most previous solutions were designed for settings where annotated training data is available, which is, however, not the case in numerous domains. We propose a light-weight and scalable entity linking method, Eigenthemes, that relies solely on the availability of entity names and a referent knowledge base. Eigenthemes exploits the fact that the entities that are truly mentioned in a document (the "gold entities") tend to form a semantically dense subset of the set of all candidate entities in the document. Geometrically speaking, when representing entities as vectors via some given embedding, the gold entities tend to lie in a low-rank subspace of the full embedding space. Eigenthemes identifies this subspace using the singular value decomposition and scores candidate entities according to their proximity to the subspace.…
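The geometric idea in the abstract (find a low-rank subspace with the SVD, then score candidates by proximity to it) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `subspace_scores`, the `rank` parameter, the row normalization, and the use of projection norm as the proximity measure are all assumptions made for illustration, and the toy embeddings are invented.

```python
import numpy as np


def subspace_scores(candidates: np.ndarray, rank: int = 1) -> np.ndarray:
    """Score candidate embeddings by proximity to a low-rank subspace.

    Hypothetical helper, not taken from the paper.
    candidates: (n, d) array, one row per candidate entity embedding.
    Returns an (n,) array in [0, 1]; higher means closer to the subspace.
    """
    # Assumption: normalize rows so scores depend on direction, not length.
    norms = np.linalg.norm(candidates, axis=1, keepdims=True)
    unit = candidates / np.clip(norms, 1e-12, None)

    # Rows of vt are orthonormal right singular vectors, ordered by
    # decreasing singular value; the top `rank` rows span the subspace.
    _, _, vt = np.linalg.svd(unit, full_matrices=False)
    basis = vt[:rank]  # shape (rank, d)

    # Proximity measure (assumed): norm of each unit vector's projection
    # onto the subspace. 1.0 means the vector lies inside it.
    return np.linalg.norm(unit @ basis.T, axis=1)


# Toy data: four candidates pointing in a similar direction, one outlier.
toy = np.array([
    [1.00, 0.00, 0.0],
    [0.90, 0.10, 0.0],
    [1.00, -0.10, 0.0],
    [0.95, 0.05, 0.0],
    [0.00, 0.00, 1.0],  # semantically unrelated candidate
])
scores = subspace_scores(toy, rank=1)
print(np.round(scores, 3))
```

In this toy case the four aligned candidates dominate the top singular direction, so they score close to 1 while the outlier scores close to 0. A real system would take the candidate embeddings from a knowledge-base embedding model and would have to choose the rank and any candidate weighting, none of which this sketch specifies.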
