Reproducibility of COVID-19 pre-prints

Annie Collins; Rohan Alexander

arXiv:2107.10724 · stat.AP · March 17, 2022 · Scientometrics

TL;DR

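This study assesses the reproducibility of COVID-19 pre-prints by searching the text of pre-prints posted to arXiv, bioRxiv, and medRxiv for keyword markers of data and code availability. For most pre-prints in the sample (67 to 79 per cent, depending on the server), no marker of open data or open code was found.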

Contribution

It provides a systematic analysis of data and code availability in COVID-19 pre-prints, highlighting reproducibility challenges during the pandemic.

Findings

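01 No markers of open data or open code were found for 75% of arXiv pre-prints in the sample

02 No markers of open data or open code were found for 67% of bioRxiv pre-prints in the sample

03 No markers of open data or open code were found for 79% of medRxiv pre-prints in the sample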

Abstract

To examine the reproducibility of COVID-19 research, we create a dataset of pre-prints posted to arXiv, bioRxiv, and medRxiv between 28 January 2020 and 30 June 2021 that are related to COVID-19. We extract the text from these pre-prints and parse them looking for keyword markers signaling the availability of the data and code underpinning the pre-print. For the pre-prints that are in our sample, we are unable to find markers of either open data or open code for 75 per cent of those on arXiv, 67 per cent of those on bioRxiv, and 79 per cent of those on medRxiv.
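
The marker search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword patterns and the function names (`has_marker`, `classify`, `share_without_markers`) are assumptions made for illustration, and the actual keyword lists are those in the paper's repository.

```python
import re

# Hypothetical marker patterns, chosen for illustration only.
# The keyword lists actually used in the paper are not reproduced here.
DATA_MARKERS = [r"data (?:are|is) available", r"data availability", r"open data"]
CODE_MARKERS = [r"code (?:are|is) available", r"github\.com", r"open[- ]source code"]


def has_marker(text, patterns):
    """Return True if any pattern matches the text, ignoring case."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in patterns)


def classify(text):
    """Flag whether a pre-print's text contains open-data or open-code markers."""
    return {
        "open_data": has_marker(text, DATA_MARKERS),
        "open_code": has_marker(text, CODE_MARKERS),
    }


def share_without_markers(texts):
    """Fraction of texts with neither an open-data nor an open-code marker."""
    if not texts:
        return 0.0
    missing = sum(1 for t in texts if not any(classify(t).values()))
    return missing / len(texts)
```

Applied to the extracted text of each pre-print on a server, `share_without_markers` would yield the kind of per-server proportion reported in the abstract. A keyword match only signals that availability is mentioned; it does not confirm that the data or code can actually be retrieved.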

Code & Models

Repositories

anniecollins/reproducibility_markers_in_covid19_preprints (official)

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Academic Publishing and Open Access