Jointly Embedding Entities and Text with Distant Supervision

Denis Newman-Griffis; Albert M. Lai; Eric Fosler-Lussier

arXiv:1807.03399 · cs.CL · July 11, 2018

TL;DR

This paper introduces a distantly-supervised approach to jointly learn embeddings for entities and text from unannotated corpora, reducing reliance on costly structured resources and improving entity similarity and relatedness representations.

Contribution

The authors propose a method for jointly embedding entities and text that needs only a list of mappings between entities and their surface forms, so it can be applied to new domains without human-annotated text or large knowledge graph structure.
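As a rough illustration of what distant supervision from a surface-form list could look like, the sketch below finds token spans that match known surface forms and pairs every candidate entity with nearby context words. It is not the authors' implementation: the entity IDs, the `SURFACE_FORMS` table, the function names, and the window size are all hypothetical, and keeping every candidate for an ambiguous surface form is a simplifying assumption, since this summary does not describe how the paper handles ambiguity.

```python
# Hypothetical toy mapping from surface forms to candidate entity IDs.
# A real mapping would come from a terminology or knowledge base export.
SURFACE_FORMS = {
    "cold": ["E_common_cold", "E_low_temperature"],
    "common cold": ["E_common_cold"],
    "aspirin": ["E_aspirin"],
}


def find_entity_mentions(tokens, surface_forms, max_len=3):
    """Return (start, end, candidates) for each span matching a surface form."""
    mentions = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            span = " ".join(tokens[start:end]).lower()
            if span in surface_forms:
                mentions.append((start, end, surface_forms[span]))
    return mentions


def context_pairs(tokens, mentions, window=2):
    """Pair every candidate entity with the words around its mention.

    Assumption: all candidates of an ambiguous surface form are kept,
    with no weighting or disambiguation.
    """
    pairs = []
    for start, end, candidates in mentions:
        left = tokens[max(0, start - window):start]
        right = tokens[end:end + window]
        for entity in candidates:
            for word in left + right:
                pairs.append((entity, word.lower()))
    return pairs


tokens = "Aspirin will not cure the common cold".split()
mentions = find_entity_mentions(tokens, SURFACE_FORMS)
pairs = context_pairs(tokens, mentions)

for entity, word in pairs:
    print(entity, word)
```

Pairs like these could feed a skip-gram style objective alongside ordinary word-context pairs, which is one plausible way to place entities and words in a shared space without any annotated text.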

Findings

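01 The learned embeddings capture entity similarity and relatedness better than prior methods on existing biomedical datasets.

02 They also do so on a new Wikipedia-based dataset, which the authors release to the community.

03 Results on analogy completion and entity sense disambiguation indicate that entities and words encode complementary information.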

Abstract

Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of mappings between entities and surface forms. We learn embeddings from open-domain and biomedical corpora, and compare against prior methods that rely on human-annotated text or large knowledge graph structure. Our embeddings capture entity similarity and relatedness better than prior work, both in existing biomedical datasets and a new Wikipedia-based dataset that we release to the community. Results on analogy completion and entity sense disambiguation indicate that entities and words capture…
