Open Named Entity Modeling from Embedding Distribution
Ying Luo, Hai Zhao, Zhuosheng Zhang, Bingjie Tang

TL;DR
This paper finds that named entities tend to cluster together in word embedding space, enabling an open, geometric model that improves named entity recognition in multilingual and resource-poor settings without relying on fixed dictionaries.
Contribution
It introduces a novel geometric model of named entities as a hypersphere in embedding space, allowing open and multilingual entity modeling beyond traditional dictionary-based methods.
Findings
Named entities cluster together in embedding space.
The hypersphere model enables open, multilingual entity recognition.
Cross-lingual mapping improves resource-poor language NER.
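To make the hypersphere idea concrete, here is a minimal sketch in Python. It is not the authors' procedure: it assumes the sphere is fitted by taking the mean of known entity vectors as the centre and a distance quantile as the radius. The function names (fit_hypersphere, inside), the quantile parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_hypersphere(entity_vecs, quantile=0.95):
    """Fit a centre and radius to entity embeddings (illustrative assumption:
    centre = mean vector, radius = a quantile of distances to the centre)."""
    center = entity_vecs.mean(axis=0)
    dists = np.linalg.norm(entity_vecs - center, axis=1)
    radius = float(np.quantile(dists, quantile))
    return center, radius

def inside(vecs, center, radius):
    """Return a boolean mask: True where a vector lies within the sphere."""
    return np.linalg.norm(vecs - center, axis=1) <= radius

# Synthetic stand-in data, not real embeddings: a tight "entity" cluster
# and a broader cloud of "other" words.
rng = np.random.default_rng(0)
entities = rng.normal(loc=3.0, scale=0.3, size=(200, 50))
others = rng.normal(loc=0.0, scale=1.0, size=(200, 50))

center, radius = fit_hypersphere(entities)
entity_hits = inside(entities, center, radius).mean()  # share of entities covered
other_hits = inside(others, center, radius).mean()     # share of non-entities covered
print(f"entities inside: {entity_hits:.2f}, others inside: {other_hits:.2f}")
```

Under these assumptions, a word needs no dictionary entry to be treated as a candidate entity; it only needs to fall inside the fitted sphere, which is what makes the definition open.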
Abstract
In this paper, we report our discovery about the distribution of named entities in a general word embedding space, which supports an open definition of multilingual named entities in place of the previous closed and constrained definition through a named entity dictionary, which is usually derived from human labor and relies on scheduled updates. Our initial visualization of monolingual word embeddings indicates that named entities tend to gather together regardless of named entity type and language, which enables us to model all named entities using a specific geometric structure inside the embedding space, namely the named entity hypersphere. For monolingual cases, the proposed named entity model gives an open description of diverse named entity types and different languages. For cross-lingual cases, mapping the proposed named entity model provides a novel way to build a named…
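The text available here does not say how the named entity model is mapped between languages. As one possible illustration, the sketch below uses orthogonal Procrustes, a common technique for aligning two embedding spaces, and carries a sphere centre from a source space into a target space. The function name procrustes_map and the synthetic data are hypothetical, and the choice of Procrustes is an assumption rather than the paper's stated method.

```python
import numpy as np

def procrustes_map(src, tgt):
    """Orthogonal matrix W minimising ||src @ W - tgt||_F.
    Rows of src and tgt are embeddings of paired (translation) words."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Synthetic stand-in data: the "target language" space is an exact
# rotation of the "source language" space, so the map is recoverable.
rng = np.random.default_rng(1)
dim = 20
true_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src_words = rng.normal(size=(100, dim))
tgt_words = src_words @ true_rot

W = procrustes_map(src_words, tgt_words)

# Assumption: the first 10 rows play the role of known source entities.
src_center = src_words[:10].mean(axis=0)
mapped_center = src_center @ W  # sphere centre expressed in the target space
print(np.allclose(W, true_rot))
```

Because an orthogonal map preserves distances, a sphere's radius would carry over unchanged in this sketch, so only the centre needs to be transformed.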
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
