# Accuracy of Race and Ethnicity Data in the Pediatric Electronic Health Record: A Concordance and System Adequacy Study

**Authors:** John D. Cowden, Rachel Drake, Jessi Johnson, Katiana Kelty, Mehwish Ahmed

PMC · DOI: 10.1089/heq.2024.0188 · Health Equity · 2025-05-12

## TL;DR

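This study measures how accurately race and ethnicity are recorded in a pediatric electronic health record (EHR) by comparing it with patient and caregiver survey responses. Overall concordance was 74.6%, and sensitivity for patients with multiple racial or ethnic identities was only 35%, gaps that could hinder health equity efforts.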

## Contribution

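The study applies the U.S. Census Bureau’s novel categorization scheme to EHR data validation, analyzing each racial and ethnic identity “alone” and “in combination” with other identities. This approach keeps all patient identities visible across categories and reveals design limitations in the current EHR.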

## Key findings

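- Overall concordance between survey and EHR race/ethnicity data was 74.6% for consolidated fields. Across the three categories reported (Black or African American, Hispanic, and White), concordance ranged from 90.6% to 96.1% for identities alone and from 84.4% to 93.9% for identities in combination with another category.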
- The EHR poorly captured patients with multiple racial or ethnic identities (overall sensitivity 35%), often oversimplifying their identities because of EHR design.
- Conventional race and ethnicity analysis fails to represent complex identities, limiting health equity efforts.

## Abstract

Conventional race and ethnicity categories and analysis are reductive and prone to inaccuracy. Because race and ethnicity data validity is essential to health equity efforts, we measured the accuracy of race and ethnicity data in a pediatric electronic health record (EHR) to identify areas for improvement in data collection and use.

Patients and their caregivers reported patient race and ethnicity via in-person survey in four pediatric settings (inpatient, emergency room, urgent care, and primary care). Race and ethnicity data from the EHR were compared with survey data to calculate four measures of EHR data accuracy. The U.S. Census Bureau’s novel categorization scheme was used to analyze racial and ethnic identities “alone” and “in combination” with ≥1 other identity.
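The “alone” and “in combination” tally described above can be illustrated with a minimal Python sketch. The function name, the record format (one set of identity labels per patient), and the toy data are assumptions made for illustration; they are not taken from the paper and do not reproduce the study's analysis code.

```python
from collections import Counter

def tally_alone_and_in_combination(records):
    """Count each identity when reported alone and when reported with >=1 other.

    `records` is an iterable of sets of identity labels, one set per patient
    (a hypothetical format chosen for this sketch).
    """
    alone = Counter()
    in_combination = Counter()
    for identities in records:
        unique = set(identities)
        if not unique:
            continue  # no identity recorded; skipped in this sketch
        if len(unique) == 1:
            alone[next(iter(unique))] += 1
        else:
            # A patient with several identities is counted once under each one,
            # so no identity disappears into a single "multiracial" bucket.
            for identity in unique:
                in_combination[identity] += 1
    return alone, in_combination

# Toy data, invented for illustration only.
toy_records = [
    {"White"},
    {"Black or African American"},
    {"White", "Hispanic"},
    {"Black or African American", "White"},
    {"Hispanic"},
]
alone, in_combination = tally_alone_and_in_combination(toy_records)
```

Because a patient with several identities contributes to the “in combination” count of each of them, category totals can exceed the number of patients. That is the intended trade-off in this sketch: visibility of every identity rather than mutually exclusive groups.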

Caregivers for 561 patients completed the survey; 116 patients aged ≥12 years completed a patient version. For consolidated race and ethnicity fields, overall concordance between survey and EHR was 74.6%. Concordance differed by race and ethnicity category when alone (Black or African American 96.1%, Hispanic 90.6%, and White 92.5%) and in combination with another category (Black or African American 93.9%, Hispanic 88.6%, and White 84.4%). The EHR had low accuracy for patients with multiple racial or ethnic identities (overall sensitivity 35%). Such patients’ identities were often oversimplified due to EHR design. Using “alone” and “in combination” analysis for race and ethnicity categories allowed all patient identities to be visible across categories, unlike in conventional race and ethnicity analysis.
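As a rough illustration of how figures such as overall concordance and multiple-identity sensitivity could be computed, the sketch below treats the survey response as the reference standard. The definitions are assumptions, since the abstract does not spell them out: concordance is taken as an exact match of the full identity set, and a multiple-identity patient counts as detected when the EHR also records more than one identity. Function names and data are hypothetical.

```python
def overall_concordance(pairs):
    """Share of patients whose EHR identity set exactly matches the survey set.

    `pairs` is a list of (survey_identities, ehr_identities) tuples.
    Exact set equality is an assumed definition, not the paper's stated one.
    """
    if not pairs:
        return None
    matches = sum(1 for survey, ehr in pairs if set(survey) == set(ehr))
    return matches / len(pairs)

def multi_identity_sensitivity(pairs):
    """Among patients with >1 identity on the survey, the share for whom
    the EHR also records >1 identity (assumed definition)."""
    multi = [(survey, ehr) for survey, ehr in pairs if len(set(survey)) > 1]
    if not multi:
        return None
    detected = sum(1 for _, ehr in multi if len(set(ehr)) > 1)
    return detected / len(multi)

# Toy data, invented for illustration only.
toy_pairs = [
    ({"White"}, {"White"}),
    ({"White", "Hispanic"}, {"White"}),  # oversimplified in the EHR
    ({"Black or African American", "White"}, {"Black or African American", "White"}),
    ({"Hispanic"}, {"Hispanic"}),
]
concordance = overall_concordance(toy_pairs)
sensitivity = multi_identity_sensitivity(toy_pairs)
```

The second toy pair shows the kind of oversimplification the study describes: a patient who reports two identities appears in the EHR with only one, which lowers both measures.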

Identifying and eliminating health disparities depend on accurate race and ethnicity data, but current EHR design provides an unreliable data foundation for needed analyses. Conventional categorization used in race and ethnicity analysis is problematic, hiding identities in a reductive set of groupings. New approaches to validation, categorization, and analysis, as explored in this study, are urgently needed to advance health equity goals.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12270523/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12270523/full.md

---
Source: https://tomesphere.com/paper/PMC12270523