Retrouver l'inventeur-auteur : la levée d'homonymies d'autorat entre les brevets et les publications scientifiques
David Reymond (IMSIC), Heman Khouilla (LEAD), Sandrine Wolff (BETA), Manuel Durand-Barthez (CJM, URFIST Paris)

TL;DR
This paper introduces a method to disambiguate inventor names and match inventor-authors across patents and scientific publications using IPC classifications and abstract similarity, with a reported disambiguation error rate below 5%.
Contribution
It presents a novel disambiguation approach combining IPC data and abstract similarity to identify inventor-authors across patent and publication datasets.
Findings
Error rate of disambiguation below 5%
Identified 2501 authors among 7720 inventors across 4679 patents
Method manually qualified on three patent corpora extracted from the EPO database Espacenet
Abstract
Patents and scientific papers are an essential source for measuring science and technology output and serve as a basis for a wide variety of scientometric analyses. Authors' and inventors' names are the key identifiers for these analyses, which, however, run up against the problem of name disambiguation. By extension, identifying inventors who are also academic authors is a non-trivial challenge. We propose a method that uses the International Patent Classification (IPC) and the IPCCAT API to assess the degree of similarity between the patent abstracts and the paper abstracts of a given inventor, in order to match both types of documents. The method is developed and manually qualified on three corpora of patents extracted from Espacenet, the international EPO database. Among a set of 4679 patents and 7720 inventors, we obtain 2501 authors. The proposed algorithm solves the general problem of…
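The matching idea in the abstract can be illustrated with a minimal sketch. It assumes that IPC symbols have already been obtained for both documents (for a paper abstract, from a classifier such as IPCCAT; the API call itself is not shown here). The Jaccard overlap, the truncation to the 4-character IPC subclass, the function names, and the 0.3 threshold are all illustrative assumptions, not the authors' actual scoring.

```python
from typing import Iterable, Set


def ipc_subclasses(symbols: Iterable[str]) -> Set[str]:
    """Reduce IPC symbols such as 'G06F 17/30' to their 4-character subclass ('G06F')."""
    return {s.strip().upper()[:4] for s in symbols if len(s.strip()) >= 4}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_person(patent_ipc: Iterable[str],
                paper_ipc: Iterable[str],
                threshold: float = 0.3) -> bool:
    """Hypothetical decision rule: accept the inventor-author match when the
    IPC subclass overlap reaches the (assumed) threshold."""
    score = jaccard(ipc_subclasses(patent_ipc), ipc_subclasses(paper_ipc))
    return score >= threshold


# IPC symbols of a patent by a given inventor (made-up example values).
patent_codes = ["G06F 17/30", "G06N 3/08"]
# IPC symbols assumed to have been predicted for a homonymous author's paper abstract.
paper_codes = ["G06F 16/35", "G06N 20/00", "H04L 9/32"]

score = jaccard(ipc_subclasses(patent_codes), ipc_subclasses(paper_codes))
print(f"overlap = {score:.2f}, match = {same_person(patent_codes, paper_codes)}")
```

Comparing at subclass level keeps the measure tolerant of fine-grained classification differences between a patent and a paper on the same topic; a real pipeline would need to calibrate both the level and the threshold against manually checked cases.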
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Scientometrics and bibliometrics research · Web Data Mining and Analysis
Methods: Sparse Evolutionary Training
