# CureSCi Metadata Catalog—Finding and harmonizing studies for secondary analysis of hydroxyurea discontinuation in sickle cell disease

**Authors:** Xin Wu, Jeran Stratford, Karen Kesler, Cataia Ives, Tabitha Hendershot, Barbara Kroner, Ying Qin, Huaqin Pan

PMC · DOI: 10.1371/journal.pone.0309572 · PLOS One · 2025-04-23

## TL;DR

The CureSCi Metadata Catalog helps researchers find and combine data from sickle cell disease studies to better understand why patients stop hydroxyurea treatment.

## Contribution

The CureSCi Metadata Catalog is introduced as a tool to streamline data harmonization and secondary analysis in rare disease research.

## Key findings

- A harmonized dataset from five studies identified factors linked to hydroxyurea discontinuation.
- Patients who were female, had a history of blood transfusion, experienced pain, or had the SC genotype were more likely to stop hydroxyurea.
- The CureSCi Metadata Catalog provides a template for efficient dataset discovery and analysis in sickle cell disease research.

## Abstract

Sickle cell disease (SCD) is a rare group of inherited red blood cell disorders that affect hemoglobin, resulting in serious multi-system complications. The limited number of patients available to participate in research studies can hinder the investigation of complex relationships. Secondary analysis is a research method that involves using existing data to answer new research questions. Data harmonization enables secondary analysis by combining data across studies, which is especially helpful for rare disease research, where individual studies may be small. The National Heart, Lung, and Blood Institute Cure Sickle Cell Initiative (CureSCi) Metadata Catalog is a web-based tool to identify SCD study datasets for conducting data harmonization and secondary analysis. We present a proof-of-concept secondary analysis to explore factors associated with discontinuation of hydroxyurea, a safe and effective first-line SCD therapy, to illustrate how the CureSCi Metadata Catalog can expedite and enable more robust SCD research.

We performed secondary analysis of SCD studies using a multi-step workflow: develop research questions, identify study datasets, identify variables of interest, harmonize variables, and establish an analysis method. A harmonized dataset consisting of eight predictor variables across five studies was created. Secondary analysis employed a generalized linear model to identify factors that significantly impact hydroxyurea discontinuation.
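
The "harmonize variables" step can be illustrated with a minimal sketch. It assumes each study records the same concept (here, sex) under a different column name and coding. The study names, column names, and codes below are hypothetical and are not taken from the paper or the catalog.

```python
from typing import Optional

# Hypothetical per-study codebooks (assumed for illustration only): each one
# names the study's raw column and maps its raw codes to shared values.
CODEBOOKS = {
    "study_a": {"column": "SEX", "values": {"1": "male", "2": "female"}},
    "study_b": {"column": "gender", "values": {"M": "male", "F": "female"}},
}


def harmonize_sex(study: str, record: dict) -> Optional[str]:
    """Return the harmonized sex value, or None if missing or unmapped."""
    book = CODEBOOKS[study]
    raw = record.get(book["column"])
    if raw is None:
        return None
    return book["values"].get(str(raw).strip())


def harmonize(study: str, records: list) -> list:
    """Convert one study's raw records to the shared schema."""
    return [{"study": study, "sex": harmonize_sex(study, r)} for r in records]


# Pool records from both hypothetical studies into one harmonized dataset.
pooled = harmonize("study_a", [{"SEX": 2}, {"SEX": 1}]) + harmonize(
    "study_b", [{"gender": "F"}, {"gender": None}]
)
```

Keeping the study label on every pooled record is a deliberate choice in this sketch: it lets a later model account for between-study differences.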

The CureSCi Metadata Catalog provided a platform to efficiently find relevant studies and design a harmonization strategy to prepare data for secondary analysis. Multivariate analysis of the harmonized data showed that patients who were female, had a history of blood transfusion therapy, experienced pain, or had the SC sickle cell genotype were more likely to stop hydroxyurea treatment.
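
To make the modeling step concrete, the sketch below fits a generalized linear model to a binary discontinuation outcome. It assumes a binomial family with a logit link (logistic regression), which the abstract does not specify. It uses a single binary predictor rather than the eight in the paper. The counts are synthetic toy data, not results from the study.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit y ~ intercept + slope * x by Newton-Raphson (equivalent to IRLS)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the score (gradient) and the Fisher information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system using the explicit inverse.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic toy data (assumption, not from the paper):
# x = 1 marks a hypothetical risk factor, y = 1 marks discontinuation.
x = [0] * 40 + [1] * 40
y = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20

intercept, slope = fit_logistic(x, y)
odds_ratio = math.exp(slope)  # odds of discontinuation, exposed vs. unexposed
```

With one binary predictor, the fitted odds ratio matches the value computed directly from the 2x2 table, (20/20)/(10/30) = 3, which is a convenient check on the fitting routine. A real analysis with several predictors would more likely use an established statistics package than hand-written Newton steps.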

This secondary analysis provides a template for how the CureSCi Metadata Catalog expedites dataset discovery of sickle cell studies for identifying relationships between variables or validating existing findings.

## Linked entities

- **Chemicals:** hydroxyurea (PubChem CID 3657)
- **Diseases:** sickle cell disease (MONDO:0011382)

## Full-text entities

- **Diseases:** pain (MESH:D010146), inherited red blood cell disorders (MESH:D029503), SCD (MESH:D000755)
- **Chemicals:** hydroxyurea (MESH:D006918)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12017531/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12017531/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC12017531/full.md

---
Source: https://tomesphere.com/paper/PMC12017531