'Moving On' -- Investigating Inventors' Ethnic Origins Using Supervised Learning
Matthias Niggli

TL;DR
This paper uses supervised learning with LSTM neural networks to infer inventors' ethnic origins from names, revealing increasing diversity and geographic patterns in global inventor populations over recent decades.
Contribution
It introduces a novel method for estimating inventors' ethnic origins from names using supervised learning, enabling large-scale analysis of inventor diversity.
Findings
Global ethnic diversity among inventors has increased over decades.
Asian-origin inventors have seen a relative rise in representation.
Foreign-origin inventors are especially prevalent in the USA and emerging high-tech fields.
Abstract
Patent data provides rich information about technical inventions, but does not disclose the ethnic origin of inventors. In this paper, I use supervised learning techniques to infer this information. To do so, I construct a dataset of 95'202 labeled names and train a recurrent neural network with long short-term memory (LSTM) to predict ethnic origins based on names. The trained network achieves an overall performance of 91% across 17 ethnic origins. I use this model to classify and investigate the ethnic origins of 2.68 million inventors and provide novel descriptive evidence regarding their ethnic origin composition over time and across countries and technological fields. The global ethnic origin composition has become more diverse over the last decades, mostly due to a relative increase of Asian-origin inventors. Furthermore, the prevalence of foreign-origin…
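To make the name-based setup described in the abstract more concrete, here is a minimal sketch of how a name might be turned into a fixed-length character sequence, the kind of input a character-level LSTM typically consumes. The alphabet, the maximum length of 30, and the function names `encode_name` and `one_hot` are illustrative assumptions; the abstract does not specify the paper's actual encoding.

```python
import string

# Assumed alphabet (not from the paper): lowercase letters, space, and hyphen.
# Index 0 is reserved for padding.
ALPHABET = string.ascii_lowercase + " -"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 30  # assumed maximum name length


def encode_name(name, max_len=MAX_LEN):
    """Map a name to a fixed-length list of character indices (0 = padding)."""
    cleaned = name.lower().strip()
    # Characters outside the assumed alphabet (e.g. accented letters) are dropped.
    indices = [CHAR_TO_INDEX[ch] for ch in cleaned if ch in CHAR_TO_INDEX]
    indices = indices[:max_len]
    return indices + [0] * (max_len - len(indices))


def one_hot(indices, vocab_size=len(ALPHABET) + 1):
    """Expand indices into one-hot rows, one row per character position."""
    return [[1 if i == idx else 0 for i in range(vocab_size)] for idx in indices]


encoded = encode_name("Li Wei")
matrix = one_hot(encoded)
print(encoded[:6])                  # indices of the six characters in the name
print(len(matrix), len(matrix[0]))  # sequence length and vocabulary size
```

In a full pipeline, a matrix like this would be fed to a recurrent network whose final layer outputs a probability for each of the 17 origin classes. Dropping characters outside the alphabet is a simplification here; a real encoder would likely need to handle diacritics and non-Latin scripts explicitly.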
Taxonomy
Topics: Migration, Ethnicity, and Economy
