CONNA: Addressing Name Disambiguation on The Fly
Bo Chen, Jing Zhang, Jie Tang, Lingfan Cai, Zhaoyu Wang, Shu Zhao, Hong Chen, Cuiping Li

TL;DR
CONNA is a reinforcement learning framework for real-time name disambiguation that jointly trains a matching component and a decision component. It improves F1-score by up to 19.84% and has been deployed in an academic search system.
Contribution
The paper introduces a novel reinforcement learning approach for on-the-fly name disambiguation, jointly training matching and decision modules for improved performance.
Findings
Achieves up to 19.84% improvement in F1-score.
Successfully deployed on a large academic search platform.
Demonstrates effectiveness on two real-world datasets.
Abstract
Name disambiguation is a key and very tough problem in many online systems such as social search and academic search. Despite considerable research, a critical issue that has not been systematically studied is disambiguation on the fly -- completing the disambiguation in real time. This is very challenging, as the disambiguation algorithm must be accurate, efficient, and error-tolerant. In this paper, we propose a novel framework -- CONNA -- to train a matching component and a decision component jointly via reinforcement learning. The matching component is responsible for finding the top matched candidate for the given paper, and the decision component is responsible for deciding whether to assign the top matched person or create a new person. The two components are intertwined and can be bootstrapped via joint training. Empirically, we evaluate CONNA on two name…
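The match-then-decide flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: CONNA learns both components and trains them jointly with reinforcement learning, whereas the sketch below substitutes plain cosine similarity for the matching component and a fixed threshold for the decision component. The names `match`, `decide`, the `"NEW"` label, the toy vectors, and the 0.8 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match(paper: Vector, candidates: Dict[str, Vector]) -> Optional[Tuple[str, float]]:
    """Matching step (stand-in): return the best-scoring candidate person and its score."""
    if not candidates:
        return None
    scored = [(person_id, cosine(paper, vec)) for person_id, vec in candidates.items()]
    return max(scored, key=lambda item: item[1])


def decide(top: Optional[Tuple[str, float]], threshold: float = 0.8) -> str:
    """Decision step (stand-in): assign the top candidate, or signal a new person."""
    if top is None or top[1] < threshold:
        return "NEW"  # hypothetical label meaning "create a new person"
    return top[0]


# Toy data: two existing persons sharing a name, each summarized by one vector.
candidates = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}

assignment_1 = decide(match([0.9, 0.1], candidates))  # close to person_a
assignment_2 = decide(match([0.7, 0.7], candidates))  # ambiguous between both
```

With these toy vectors, the first paper is assigned to `person_a`, while the second falls below the threshold for both candidates and is routed to a new person. In the paper's framing, the fixed threshold is the part a learned decision component would replace.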
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Privacy-Preserving Technologies in Data
