TL;DR
This paper investigates how text-based entity vector spaces capture structural regularities like organizational committees, co-author networks, and academic ranks, comparing various unsupervised embedding methods.
Contribution
It systematically analyzes the ability of different unsupervised text embedding methods to encode structural entity relationships and hierarchies.
Findings
Neural methods such as doc2vec and SERT outperform LSI, LDA, and word2vec at clustering experts.
SERT encodes entity relations best.
Abstract
Entity retrieval is the task of finding entities such as people or products in response to a query, based solely on the textual documents they are associated with. Recent semantic entity retrieval algorithms represent queries and experts in finite-dimensional vector spaces, where both are constructed from text sequences. We investigate entity vector spaces and the degree to which they capture structural regularities. Such vector spaces are constructed in an unsupervised manner without explicit information about structural aspects. For concreteness, we address these questions for a specific type of entity: experts in the context of expert finding. We discover how clusterings of experts correspond to committees in organizations, the ability of expert representations to encode the co-author graph, and the degree to which they encode academic rank. We compare latent, continuous…
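To make the committee comparison concrete, here is a minimal sketch of one way to score how well a clustering of experts lines up with known committee membership. It uses cluster purity as a stand-in metric; the summary does not say which measure the paper reports, so the metric, the function name `cluster_purity`, and the toy data are all assumptions for illustration.

```python
from collections import Counter


def cluster_purity(cluster_ids, committee_ids):
    """Fraction of experts who belong to the majority committee of their cluster.

    Hypothetical helper: purity is an illustrative choice, not necessarily
    the measure used in the paper.
    """
    if len(cluster_ids) != len(committee_ids):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Group the committee labels by the cluster each expert was assigned to.
    members = {}
    for cluster, committee in zip(cluster_ids, committee_ids):
        members.setdefault(cluster, []).append(committee)

    # For each cluster, count how many experts share its most common committee.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(cluster_ids)


# Toy data (invented): cluster assignments for six experts, e.g. from
# clustering their vectors, and the committee each expert sits on.
clusters = [0, 0, 0, 1, 1, 2]
committees = ["db", "db", "ir", "ir", "ir", "ml"]
purity = cluster_purity(clusters, committees)
```

A purity of 1.0 would mean every cluster contains experts from a single committee. Purity rises trivially as the number of clusters grows, so a chance-corrected measure is usually a safer choice when comparing embedding methods.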
Taxonomy
Methods: Linear Discriminant Analysis
