Lexicon Infused Phrase Embeddings for Named Entity Resolution

Alexandre Passos; Vineet Kumar; Andrew McCallum

arXiv:1404.5367 · cs.CL · April 23, 2014

TL;DR

This paper introduces lexicon-infused neural word embeddings, which integrate lexicon information into embedding learning, and uses them to reach state-of-the-art named-entity recognition results on the CoNLL and OntoNotes benchmarks.

Contribution

The paper presents a new method for learning word embeddings that incorporate lexicon information, and the first system to achieve state-of-the-art NER results using neural word embeddings.

Findings

01

Achieved an F1 score of 90.90 on the CoNLL 2003 NER test set.

02

Significantly outperformed every previous NER system trained on public data.

03

Matched the performance of a system that relies on massive private industrial query-log data.

Abstract

Most state-of-the-art approaches for named-entity recognition (NER) use semi-supervised information in the form of word clusters and lexicons. Recently, neural network-based language models have been explored because, as a byproduct, they generate highly informative vector representations for words, known as word embeddings. In this paper we present two contributions: a new form of learning word embeddings that can leverage information from relevant lexicons to improve the representations, and the first system to use neural word embeddings to achieve state-of-the-art results on named-entity recognition in both CoNLL and OntoNotes NER. Our system achieves an F1 score of 90.90 on the test set for CoNLL 2003, significantly better than any previous system trained on public data, and matching a system employing massive private industrial query-log data.
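The F1 score quoted above is an entity-level metric: a predicted entity counts as correct only if both its span and its type match the gold annotation. The sketch below illustrates that computation for a single BIO-tagged sequence. It is not the official CoNLL evaluation script, and the function names `bio_to_spans` and `entity_f1` and the example tags are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical helper type
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag == "O":
            start, label = None, None
        else:
            # A "B-X" tag, or a stray "I-X" tag, opens a new span of type X.
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match entity-level F1 between gold and predicted tags."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: the person span matches, the location is mistyped as ORG.
gold_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "B-ORG"]
score = entity_f1(gold_tags, pred_tags)
```

In this example one of two predicted entities is correct and one of two gold entities is found, so precision and recall are both 0.5. Reported benchmark scores aggregate counts over the whole test set rather than averaging per-sentence values.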
