Bayesian Non-Exhaustive Classification for Active Online Name Disambiguation
Baichuan Zhang, Murat Dundar, Mohammad Al Hasan

TL;DR
This paper introduces a Bayesian non-exhaustive classification framework for online name disambiguation, built on a Dirichlet process Gaussian mixture model (DPGMM). It classifies records of known individuals and discovers new individuals in streaming data, with improved accuracy over existing methods.
Contribution
It presents a novel online Bayesian approach with inference algorithms for simultaneous classification and new class discovery in streaming name disambiguation.
Findings
Significantly outperforms existing online disambiguation methods
Effective in identifying new ambiguous persons in streaming data
Interactive version leverages user feedback for improved accuracy
Abstract
The name disambiguation task partitions a collection of records pertaining to a given name, such that there is a one-to-one correspondence between the partitions and a group of people, all sharing that given name. Most existing solutions for this task are proposed for static data. However, more realistic scenarios stipulate emergence of records in a streaming fashion where records may belong to known as well as unknown persons all sharing the same name. This requires a flexible name disambiguation algorithm that can not only classify records of known persons represented in the training data by their existing records but can also identify records of new ambiguous persons with no existing records included in the initial training dataset. Toward achieving this objective, in this paper we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation.…
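To make the idea of non-exhaustive classification concrete, here is a minimal Python sketch of how a streaming record could either join a known person or open a new class. It is an illustration under strong assumptions, not the authors' model or inference algorithm: each person is an isotropic Gaussian with known variance `sigma2`, class means share a Gaussian prior with mean `mu0` and variance `tau2`, the concentration parameter `alpha` sets how readily a new person is created, and each record is assigned greedily to the highest-scoring option. The class name `OnlineNonExhaustiveClassifier` and all parameter names are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) at x."""
    d = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq / var)


class OnlineNonExhaustiveClassifier:
    """Toy Dirichlet-process-style classifier (hypothetical, for illustration)."""

    def __init__(self, alpha=1.0, sigma2=1.0, tau2=100.0, mu0=(0.0, 0.0)):
        self.alpha = alpha      # assumed concentration: weight of "new person"
        self.sigma2 = sigma2    # assumed known within-class variance
        self.tau2 = tau2        # assumed prior variance of class means
        self.mu0 = list(mu0)    # assumed prior mean of class means
        self.counts = []        # number of records per class
        self.sums = []          # per-dimension feature sums per class

    def add_known_class(self, records):
        """Register a person who has records in the initial training data."""
        self.counts.append(len(records))
        self.sums.append([sum(col) for col in zip(*records)])

    def _log_weight(self, k, x):
        """Log of (class size) times (posterior predictive density of x)."""
        n = self.counts[k]
        prec = 1.0 / self.tau2 + n / self.sigma2
        mean = [(m0 / self.tau2 + s / self.sigma2) / prec
                for m0, s in zip(self.mu0, self.sums[k])]
        return math.log(n) + log_gauss(x, mean, 1.0 / prec + self.sigma2)

    def classify(self, x):
        """Assign x to an existing class or create a new one; return its index."""
        scores = [self._log_weight(k, x) for k in range(len(self.counts))]
        # Score for a previously unseen person: prior predictive density.
        scores.append(math.log(self.alpha)
                      + log_gauss(x, self.mu0, self.tau2 + self.sigma2))
        k = max(range(len(scores)), key=scores.__getitem__)
        if k == len(self.counts):  # the "new person" option won
            self.counts.append(0)
            self.sums.append([0.0] * len(x))
        self.counts[k] += 1
        self.sums[k] = [s + xi for s, xi in zip(self.sums[k], x)]
        return k
```

In this sketch a record far from every known person scores best under the prior predictive term and starts a new class, and later records near it are then attracted to that class. The greedy arg-max is a simplification chosen for brevity; a fuller treatment would sample or track uncertainty over assignments.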
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Biomedical Text Mining and Ontologies
