# Statistical Accuracy of Administratively Recorded Race/Ethnicity in the Military Health System and Race/Ethnicity Ascertained via Questionnaire

**Authors:** Jordan McAdam, Stephanie A. Richard, Cara H. Olsen, Celia Byrne, Shawn Clausen, Amber Michel, Brian K. Agan, Robert O’Connell, Timothy H. Burgess, David R. Tribble, Simon Pollett, James D. Mancuso, Jennifer A. Rusiecki

PMC · DOI: 10.1007/s40615-025-02351-7 · 2025-03-21

## TL;DR

This study assesses how accurately race and ethnicity are recorded administratively in the US Military Health System compared with self-reported questionnaire data, finding misclassification for some minority groups and substantial missing data among dependent beneficiaries.

## Contribution

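The study quantifies the accuracy of race/ethnicity data in the MHS Data Repository (MDR) against self-reported responses from the EPICC cohort, reporting sensitivity and positive predictive value (PPV) by race/ethnicity category separately for active duty/retired members and dependent beneficiaries.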

## Key findings

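- Among active duty/retired members, sensitivity and PPV were highest for NH White (0.93, 0.96), NH Black (0.90, 0.92), Hispanic (0.80, 0.93), and NH A/PI (0.84, 0.95), and lowest for NH AI/AN (0.62, 0.57) and NH Other (0.09, 0.03).
- The MDR was missing race/ethnicity data for approximately 63% of dependent beneficiaries. In this group, sensitivity was much lower than among active duty/retired members (0.13 to 0.55 across the four categories reported), while PPV stayed comparatively high (0.84 to 1.00).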
- Misclassification and missing data may bias health disparity analyses and research in the MHS.

## Abstract

Unequal disease burdens such as SARS-CoV-2 infection rates and COVID-19 outcomes across race/ethnicity groups have been reported. Misclassification of and missing race and ethnicity (race/ethnicity) data hinder efforts to identify and address health disparities in the US Military Health System (MHS); therefore, we evaluated the statistical accuracy of administratively recorded race/ethnicity data in the MHS Data Repository (MDR) through comparison to self-reported race/ethnicity collected via questionnaire in the Epidemiology, Immunology, and Clinical Characteristics of Emerging Infectious Diseases with Pandemic Potential (EPICC) cohort study.

The study population included 6009 active duty/retired military (AD/R) and dependent beneficiaries (DB). Considering EPICC study responses the “gold standard,” we calculated sensitivity and positive predictive value (PPV) by race/ethnicity category (non-Hispanic (NH) White, NH Black, Hispanic, NH Asian/Pacific Islander (A/PI), NH American Indian/Alaskan Native (AI/AN), NH Other, missing/unknown).
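The two measures described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `sensitivity_ppv`, the `"Missing"` label, and the toy counts are hypothetical and chosen only to show the arithmetic. Self-reported labels are treated as the reference standard, so sensitivity is the share of people self-reporting a category whose administrative record agrees, and PPV is the share of administrative records in a category that match self-report.

```python
from collections import Counter


def sensitivity_ppv(pairs, category):
    """Return (sensitivity, PPV) for one category.

    pairs: iterable of (self_reported, recorded) labels, where the
    self-reported label is treated as the reference standard.
    """
    counts = Counter(pairs)
    # Records where both sources agree on this category.
    true_pos = counts[(category, category)]
    # Everyone who self-reported this category (sensitivity denominator).
    reference_total = sum(n for (ref, _), n in counts.items() if ref == category)
    # Everyone recorded administratively in this category (PPV denominator).
    recorded_total = sum(n for (_, rec), n in counts.items() if rec == category)
    sens = true_pos / reference_total if reference_total else float("nan")
    ppv = true_pos / recorded_total if recorded_total else float("nan")
    return sens, ppv


# Hypothetical toy data, NOT taken from the study.
pairs = (
    [("NH White", "NH White")] * 8
    + [("NH White", "Missing")] * 2
    + [("Hispanic", "Hispanic")] * 3
    + [("Hispanic", "NH White")] * 1
    + [("Hispanic", "Missing")] * 4
)

for cat in ("NH White", "Hispanic"):
    sens, ppv = sensitivity_ppv(pairs, cat)
    print(f"{cat}: sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

In this toy data, records with a missing administrative value count against sensitivity (they appear in the self-report denominator) but never enter a PPV denominator, so a category can have low sensitivity and high PPV at the same time.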

Among AD/R, the highest sensitivity and PPV values were for NH White (0.93, 0.96), NH Black (0.90, 0.92), Hispanic (0.80, 0.93), and NH A/PI (0.84, 0.95) and lowest for NH AI/AN (0.62, 0.57) and NH Other (0.09, 0.03). The MDR was missing race/ethnicity data for approximately 63% of DB and sensitivity values, though not PPV, were comparatively much lower: NH White (0.35, 0.88), NH Black (0.55, 0.89), Hispanic (0.13, 1.00), and NH A/PI (0.28, 0.84).

Our evaluation of MDR race/ethnicity data revealed misclassification, particularly among some minority groups, and substantial missingness among DB. The potential bias introduced impacts the ability to address health disparities and conduct health research in the MHS, including studies of COVID-19, and needs further examination.

The online version contains supplementary material available at 10.1007/s40615-025-02351-7.

## Linked entities

- **Diseases:** COVID-19 (MONDO:0100096)

## Full-text entities

- **Diseases:** Infectious Diseases (MESH:D003141), COVID-19 (MESH:D000086382)

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12966218/full.md

---
Source: https://tomesphere.com/paper/PMC12966218