TinyGenius: Intertwining Natural Language Processing with Microtask Crowdsourcing for Scholarly Knowledge Graph Creation
Allard Oelen, Markus Stocker, Sören Auer

TL;DR
TinyGenius combines natural language processing with crowdsourcing microtasks to validate and enhance scholarly knowledge graphs, improving accuracy and usefulness for digital libraries.
Contribution
It introduces a novel methodology that integrates NLP and crowdsourcing to validate scholarly knowledge statements, addressing accuracy challenges in automated knowledge graph creation.
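As a rough illustration of what validating an NLP-extracted statement through microtasks could look like, the sketch below collects crowd votes on a single statement and settles it by simple majority. This is not the paper's implementation: the `Statement` structure, the vote labels, the `min_votes` threshold, and the majority rule are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A hypothetical NLP-extracted statement awaiting crowd validation."""
    subject: str
    predicate: str
    obj: str
    # Each vote is one of "correct", "incorrect", or "unsure" (assumed labels).
    votes: list = field(default_factory=list)


def aggregate(statement, min_votes=3):
    """Return 'accepted', 'rejected', or 'pending' by simple majority.

    'unsure' votes are ignored; a tie or too few decisive votes
    leaves the statement pending. The threshold is an assumption.
    """
    decisive = [v for v in statement.votes if v in ("correct", "incorrect")]
    if len(decisive) < min_votes:
        return "pending"
    counts = Counter(decisive)
    if counts["correct"] > counts["incorrect"]:
        return "accepted"
    if counts["incorrect"] > counts["correct"]:
        return "rejected"
    return "pending"


# Usage: three workers judge one extracted statement.
s = Statement("Paper A", "has research topic", "knowledge graphs",
              votes=["correct", "correct", "incorrect"])
print(aggregate(s))
```

Only statements that come back as accepted would be added to the graph in this sketch; a real system would likely also weigh worker reliability and the confidence of the NLP tool.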
Findings
Crowdsourcing improves the quality of NLP-extracted knowledge.
The methodology effectively populates a scholarly knowledge graph.
Explainability of NLP methods supports crowd worker decision-making.
Abstract
As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be discovered and used more efficiently. Natural Language Processing (NLP) techniques can autonomously process scholarly articles at scale and create machine-readable representations of the article content. However, autonomous NLP methods are far from accurate enough to create a high-quality knowledge graph, and quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed through crowdsourcing. The scholarly context in which the crowd workers operate poses multiple challenges. The explainability of the employed NLP methods is crucial for providing context that supports the decision process of crowd workers. We…
