# A method to enable clinical and translational research teams with custom real-world data from electronic health record systems

**Authors:** Thomas R. Campion, Evan T. Sholle, Xiaobo Fuld, Cindy Chen, Marcos A. Davila, Vinay I. Varughese, Curtis L. Cole

PMC · DOI: 10.1017/cts.2025.10230 · Journal of Clinical and Translational Science · 2026-01-02

## TL;DR

This paper describes a method to help research teams use real-world data from electronic health records by building custom research data repositories (RDRs).

## Contribution

The novel contribution is a scalable solution using existing tools and investigator financial commitment to manage custom real-world data repositories.

## Key findings

- Weill Cornell Medicine launched over 17 custom RDRs across various medical fields from 2013 to 2025.
- Custom RDRs increased academic output and local quality improvement activities.
- RDRs evolved from IT-managed infrastructure to collaborative data partnerships with investigators.

## Abstract

Custom transformations of real-world data (RWD) from electronic health record (EHR) systems are necessary because physicians in multiple specialties and basic scientists from a variety of disciplines define study variables describing health and disease status differently. To increase RWD use, we hypothesized that a solution supporting three workflows (discovery, collection, and analysis), using existing rather than novel tools and requiring financial commitment from investigators, would scale to meet the needs of clinical and translational research teams and ensure regulatory compliance at an academic medical center.

Weill Cornell Medicine (WCM) implemented custom research data repositories (RDRs) consisting of i2b2 for discovery, REDCap for collection, and Microsoft SQL Server for analysis. WCM subsidized the central information technology (IT) department to manage RDRs and required investigators to commit $50,000 for RDR startup and $7,500 for annual maintenance.
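As a rough illustration of the investigator commitment described above, the following sketch totals the startup fee and annual maintenance over a given number of years. The function name and the assumption that maintenance is billed once per full year after startup are hypothetical; the abstract does not state the billing schedule, so this is only arithmetic on the two published figures.

```python
# Figures taken from the abstract; the billing schedule is an assumption.
STARTUP_FEE_USD = 50_000
ANNUAL_MAINTENANCE_USD = 7_500


def cumulative_investigator_cost(years_of_maintenance: int) -> int:
    """Return startup fee plus maintenance for the given number of years.

    Assumes (hypothetically) one maintenance charge per year of operation.
    """
    if years_of_maintenance < 0:
        raise ValueError("years_of_maintenance must be non-negative")
    return STARTUP_FEE_USD + ANNUAL_MAINTENANCE_USD * years_of_maintenance


if __name__ == "__main__":
    for years in (0, 1, 5):
        print(years, cumulative_investigator_cost(years))
```

Under these assumptions, an RDR maintained for five years would represent a commitment of $87,500 from the investigator, excluding whatever WCM subsidizes centrally.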

From 2013 through 2025, WCM launched more than 17 custom RDRs for pediatrics, myeloproliferative neoplasms, obstetrics and gynecology, pulmonary and critical care, chronic kidney disease, and ophthalmology among other areas. Custom RDRs enabled academic output (e.g., publications, grants) as well as local quality improvement activities.

Custom RDRs facilitated delivery of fit-for-purpose data sets derived from EHR systems and other RWD sources. Over time, RDRs have evolved from an infrastructure product delivered by central IT to a data partnership between investigators and IT.

Custom RDRs and data partnerships may help increase the use of RWD from EHR and other sources by clinical and translational research teams.

## Linked entities

- **Diseases:** myeloproliferative neoplasms (MONDO:0020076), chronic kidney disease (MONDO:0005300)

## Full-text entities

- **Diseases:** myeloproliferative neoplasms (MESH:D009369), chronic kidney disease (MESH:D051436)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12886558/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12886558/full.md

## References

67 references — full list in the complete paper: https://tomesphere.com/paper/PMC12886558/full.md

---
Source: https://tomesphere.com/paper/PMC12886558