Application of an Externally Developed Algorithm to Identify Research Cases and Controls from EHR Data: Trials and Triumphs
Nelly Estefanie Garduno-Rapp, Simone Herzberg, Henry H. Ong, Cindy Kao, Christoph U. Lehmann, Srushti Gangireddy, Nitin B. Jain, Ayush Giri

TL;DR
Researchers successfully applied and validated an EHR-based algorithm to identify cases and controls for a rotator cuff tear study across different medical centers.
Contribution
The study demonstrates successful cross-center implementation of a rule-based algorithm for EHR data classification and highlights the importance of standardization.
Findings
The algorithm correctly classified 80.9% of patients initially, with improved sensitivity (0.94) and specificity (0.76) after refinement.
The process revealed 12 data entry errors in the gold standard dataset.
Code variability between centers necessitated algorithm refinement for better performance.
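As a minimal sketch of how figures such as percent correctly classified, sensitivity, and specificity are derived when an algorithm's labels are compared against a gold standard, the Python below tallies a confusion matrix and computes the three metrics. The names (`ConfusionCounts`, `tally`, `sensitivity`, `specificity`, `accuracy`) and the toy labels are hypothetical and chosen for illustration only; they are not the study's code or data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm labels versus gold-standard labels (True = case)."""
    tp: int  # gold case, algorithm case
    fp: int  # gold control, algorithm case
    tn: int  # gold control, algorithm control
    fn: int  # gold case, algorithm control


def tally(gold: Sequence[bool], predicted: Sequence[bool]) -> ConfusionCounts:
    """Compare paired labels and count each confusion-matrix cell."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if g and p:
            tp += 1
        elif g and not p:
            fn += 1
        elif not g and p:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Divide, failing loudly when the metric is undefined."""
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Share of gold-standard cases the algorithm also labels as cases."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of gold-standard controls the algorithm also labels as controls."""
    return _ratio(c.tn, c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified the same way as the gold standard."""
    return _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn)


# Toy labels for illustration only (not study data).
gold_labels = [True, True, True, False, False, False, False, True]
algo_labels = [True, True, False, False, True, False, False, True]

counts = tally(gold_labels, algo_labels)
print(counts)
print(f"sensitivity={sensitivity(counts):.2f}")
print(f"specificity={specificity(counts):.2f}")
print(f"accuracy={accuracy(counts):.2f}")
```

Disagreements between the two label sets are also the records worth reviewing by hand, since a mismatch can reflect an error in the gold standard rather than in the algorithm.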
Abstract
The use of electronic health records (EHRs) in research demands robust and interoperable systems. By linking biorepositories to EHR algorithms, researchers can efficiently identify cases and controls for large observational studies (e.g., genome-wide association studies). This is critical for ensuring efficient and cost-effective research. However, the lack of standardized metadata and algorithms across different EHRs complicates their sharing and application. Our study presents an example of a successful implementation and validation process. This study aimed to implement and validate a rule-based algorithm from a tertiary medical center in Tennessee to classify cases and controls from a research study on rotator cuff tear (RCT) nested within a tertiary medical center in North Texas and to assess the algorithm's performance. We applied a phenotypic algorithm (designed and validated…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Machine Learning in Healthcare · Medical Coding and Health Information
