# A Call for Action: Lessons Learned From a Pilot to Share a Complex, Linked COVID-19 Cohort Dataset for Open Science

**Authors:** Clara Amid, Martine Y van Roode, Gabriele Rinck, Janko van Beek, Rory D de Vries, Gijsbert P van Nierop, Eric C M van Gorp, Frank Tobian, Bas B Oude Munnink, Reina S Sikkema, Thomas Jaenisch, Guy Cochrane, Marion P G Koopmans

PMC · DOI: 10.2196/63996 · JMIR Public Health and Surveillance · 2025-02-11

## TL;DR

This paper shares lessons from a pilot project on sharing complex linked data for open science during the COVID-19 pandemic.

## Contribution

The paper describes a pilot model for storing and sharing linked laboratory and clinical-epidemiological data using European open science infrastructure.

## Key findings

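- Building on existing pathogen genomics infrastructure at the European Nucleotide Archive (EMBL-EBI) enabled rapid development of connected data hubs for outbreak response.
- Internationally sharing linked, complex cohort datasets from opportunistic studies ran into barriers and delays, which the paper documents in an analytical timeline and a critical path.
- The data hubs linked controlled-access human clinical-epidemiological data with open, high-dimensional laboratory data under FAIR (Findable, Accessible, Interoperable, Reusable) principles.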

## Abstract

The COVID-19 pandemic showed how timely sharing of genomic sequences, together with early detection and surveillance of variants and characterization of their clinical impacts, helped to inform public health responses. However, (re)emerging infectious diseases and our global connectivity require interdisciplinary collaboration at local, national, and international levels, as well as connected data to understand the linkages between all factors involved. Here, we describe experiences and lessons learned from a COVID-19 pilot study aimed at developing a model for storing and sharing linked laboratory data and clinical-epidemiological data using European open science infrastructure. We provide insights into the barriers and complexities of internationally sharing linked, complex cohort datasets from opportunistic studies for connected data analyses. We include an analytical timeline of events, describing key actions and delays in the execution of the pilot, and a critical path, defining the steps in the process of internationally sharing a linked cohort dataset. The pilot showed how building on existing infrastructure, previously developed within the European Nucleotide Archive at the European Molecular Biology Laboratory-European Bioinformatics Institute for pathogen genomics data sharing, allowed the rapid development of connected “data hubs.” These data hubs were required to link human clinical-epidemiological data under controlled access with open, high-dimensional laboratory data under FAIR (Findable, Accessible, Interoperable, Reusable) principles. Based on our own experiences, we call for action and make recommendations to support and improve data sharing for outbreak preparedness and response.

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096)

## Full-text entities

- **Diseases:** infectious diseases (MESH:D003141), COVID-19 (MESH:D000086382)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11835595/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11835595/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC11835595/full.md

---
Source: https://tomesphere.com/paper/PMC11835595