Scholarly outputs of EU Research Funding Programs: Understanding differences between datasets of publications reported by grant holders and OpenAIRE Research Graph in H2020
Alexis-Michel Mugabushaka, Miriam Baglioni, Alessia Bardi, Paolo Manghi

TL;DR
This study compares two datasets of EU research publications, one reported by grant holders and one drawn from the OpenAIRE Research Graph, to evaluate their completeness and data quality for monitoring research funding outcomes.
Contribution
It provides a systematic comparison of publicly available datasets, highlighting the strengths of the OpenAIRE Research Graph and offering recommendations for data quality improvements.
Findings
OpenAIRE Research Graph is more comprehensive than grant-reported data.
Data quality varies between sources, affecting monitoring accuracy.
The authors offer recommendations for improving dataset completeness and reliability.
Abstract
Linking research results to grants is an essential prerequisite for effective monitoring and evaluation of funding programs. For the EU research funding programs, there are multiple datasets linking scholarly publications to individual grants, including both open data and data from commercial bibliometric databases. In this paper, we systematically compare openly available data from two sources: on the one hand, the publications reported by grant holders (and subsequently published by the European Commission on the open data portal) and, on the other, those from the OpenAIRE Research Graph, which collects data from multiple sources. We describe the dataflow leading to their creation and assess the quality of the data by validating, on a sample basis, the <project, publication> links. We report that, by and large, the OpenAIRE Research Graph offers a more complete dataset of scholarly outputs from EU Research…
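As a rough illustration of the kind of comparison the abstract describes, the sketch below treats each dataset as a collection of (project, publication) links, computes their overlap, and draws a reproducible sample for manual validation. It is not the authors' pipeline: the function names, the DOI normalization rule, and the toy project IDs and DOIs are all assumptions made for this example.

```python
import random


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common prefixes so equal DOIs compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def compare_links(reported, graph):
    """Split (project_id, doi) links into shared and source-specific sets."""
    a = {(project, normalize_doi(doi)) for project, doi in reported}
    b = {(project, normalize_doi(doi)) for project, doi in graph}
    return {"both": a & b, "only_reported": a - b, "only_graph": b - a}


def sample_for_validation(links, k, seed=0):
    """Draw up to k links for manual checking; seeded for reproducibility."""
    rng = random.Random(seed)
    ordered = sorted(links)  # sets are unordered, so sort before sampling
    return rng.sample(ordered, min(k, len(ordered)))


# Toy data (hypothetical project IDs and DOIs, not taken from either dataset).
reported = [
    ("101", "10.1000/A1"),
    ("101", "https://doi.org/10.1000/a2"),
    ("202", "10.1000/b1"),
]
graph = [
    ("101", "10.1000/a1"),
    ("202", "10.1000/b1"),
    ("202", "doi:10.1000/B2"),
    ("303", "10.1000/c1"),
]

result = compare_links(reported, graph)
to_check = sample_for_validation(result["only_graph"], k=1)
```

Links that appear in only one source are the natural candidates for sample-based validation, since the other source offers no independent confirmation for them.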
Taxonomy
Topics: Data Quality and Management · Advanced Graph Neural Networks · Scientific Computing and Data Management
