An open-source SQL database schema for integrated clinical and translational data management in clinical trials
Umar Niazi, Charlotte Stuart, Patricia Soares, Vincent Foure, Gareth Griffiths

TL;DR
This paper introduces an open-source SQL database schema to manage and integrate clinical and translational data in cancer clinical trials.
Contribution
A novel open-source SQL database schema is proposed for seamless integration of clinical and translational data in academic trial units.
Findings
The schema provides a cost-effective middle ground between raw data and secure data environments.
It allows clinical and molecular data to be shared and analysed together.
Researchers can use R to directly query the database, promoting collaboration and accelerating personalized cancer therapies.
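As a rough illustration of what querying linked clinical and translational records might look like, the sketch below builds a tiny in-memory SQLite database and joins across it. Everything in it is an assumption made for illustration: the table names (patient, sample, assay_result), the columns, the marker name and the demo values are hypothetical and are not taken from the paper's schema. Python's built-in sqlite3 module is used here as a stand-in for the R-based access the paper describes.

```python
import sqlite3

# Hypothetical three-table layout: NOT the schema published in the paper.
# Clinical data (patient) link to molecular data (assay_result) via samples.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    trial_arm  TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id   INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
    sample_type TEXT NOT NULL
);
CREATE TABLE assay_result (
    result_id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    marker    TEXT NOT NULL,
    value     REAL NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database populated with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO patient VALUES (?, ?)",
                     [(1, "A"), (2, "B")])
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)",
                     [(10, 1, "blood"), (11, 2, "tissue")])
    conn.executemany("INSERT INTO assay_result VALUES (?, ?, ?, ?)",
                     [(100, 10, "marker_x", 1.5), (101, 11, "marker_x", 2.5)])
    conn.commit()
    return conn


def marker_by_arm(conn, marker):
    """Return (trial_arm, mean marker value) pairs, joining clinical to molecular data."""
    query = """
        SELECT p.trial_arm, AVG(r.value)
        FROM patient AS p
        JOIN sample AS s ON s.patient_id = p.patient_id
        JOIN assay_result AS r ON r.sample_id = s.sample_id
        WHERE r.marker = ?
        GROUP BY p.trial_arm
        ORDER BY p.trial_arm
    """
    return conn.execute(query, (marker,)).fetchall()
```

Calling marker_by_arm(build_demo_db(), "marker_x") returns one row per trial arm. The point of the design is that a single join path (patient to sample to assay result) replaces manual merging of separately stored clinical and molecular files.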
Abstract
Unlocking the power of personalised medicine in oncology hinges on the integration of clinical trial data with translational data (i.e. biospecimen-derived molecular information). This combined analysis allows researchers to tailor treatments to a patient’s unique biological makeup. However, current practices within UK Clinical Trials Units present challenges. While clinical data are held in standardised formats, translational data are complex and diverse, and require specialised storage. This disparity in format creates significant hurdles for researchers aiming to curate, integrate and analyse these datasets effectively. This article proposes a novel solution: an open-source SQL database schema designed specifically for the needs of academic trial units. Inspired by Cancer Research UK’s commitment to open data sharing and exemplified by the Southampton Clinical Trials Unit’s CONFIRM…
Figure 1
Figure 2
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Radiomics and Machine Learning in Medical Imaging
