Subgroup Validity in Machine Learning for Echocardiogram Data
Cynthia Feeney, Shane Williams, Benjamin S. Wessler, Michael C. Hughes

TL;DR
This paper shows that current open echocardiogram datasets and published models cannot yet establish subgroup validity, and argues that better demographic reporting and subgroup-specific analysis are needed before deployment.
Contribution
The study improves sociodemographic reporting for two datasets, TMED-2 and MIMIC-IV-ECHO, surveys six open datasets, and performs an exploratory subgroup analysis of two published models. Together these reveal gaps in subgroup validity and underrepresentation of many patient groups.
Findings
The six open datasets analyzed do not consider gender-diverse patients and contain too few patients in many racial and ethnic groups.
Models show no conclusive validity across subgroups.
More data and subgroup-focused analysis are needed.
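To illustrate why small subgroup counts leave validity inconclusive, the following is a minimal sketch of a per-subgroup evaluation with percentile bootstrap confidence intervals. It is not the authors' method: the function name, the subgroup labels "A" and "B", the synthetic records, and the use of plain accuracy as the metric are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def subgroup_accuracy_ci(records, n_boot=1000, alpha=0.05, seed=0):
    """Per-subgroup accuracy with a percentile bootstrap interval.

    records: iterable of (subgroup, y_true, y_pred) tuples (hypothetical layout).
    Returns {subgroup: (n, accuracy, ci_low, ci_high)}.
    """
    rng = random.Random(seed)

    # Group per-patient correctness (1 = correct, 0 = wrong) by subgroup.
    by_group = defaultdict(list)
    for group, y_true, y_pred in records:
        by_group[group].append(1 if y_true == y_pred else 0)

    results = {}
    for group, hits in by_group.items():
        n = len(hits)
        acc = sum(hits) / n
        # Resample patients with replacement and recompute accuracy each time.
        boots = sorted(sum(rng.choices(hits, k=n)) / n for _ in range(n_boot))
        lo = boots[int((alpha / 2) * n_boot)]
        hi = boots[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
        results[group] = (n, acc, lo, hi)
    return results


# Synthetic data (not from any real dataset): both subgroups have the same
# observed accuracy, but subgroup "B" has far fewer patients.
demo = (
    [("A", 1, 1)] * 360 + [("A", 1, 0)] * 40
    + [("B", 1, 1)] * 9 + [("B", 1, 0)] * 1
)
results = subgroup_accuracy_ci(demo)
for group, (n, acc, lo, hi) in sorted(results.items()):
    print(f"{group}: n={n} accuracy={acc:.2f} CI=[{lo:.2f}, {hi:.2f}]")
```

In this synthetic example the two subgroups share the same point estimate, yet the interval for the smaller subgroup is much wider, so the data cannot confirm or rule out a performance gap. That is the sense in which insufficient patient counts make subgroup conclusions inconclusive.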
Abstract
Echocardiogram datasets enable training deep learning models to automate interpretation of cardiac ultrasound, thereby expanding access to accurate readings of diagnostically-useful images. However, the gender, sex, race, and ethnicity of the patients in these datasets are underreported and subgroup-specific predictive performance is unevaluated. These reporting deficiencies raise concerns about subgroup validity that must be studied and addressed before model deployment. In this paper, we show that current open echocardiogram datasets are unable to assuage subgroup validity concerns. We improve sociodemographic reporting for two datasets: TMED-2 and MIMIC-IV-ECHO. Analysis of six open datasets reveals no consideration of gender-diverse patients and insufficient patient counts for many racial and ethnic groups. We further perform an exploratory subgroup analysis of two published aortic…
Taxonomy
Topics: Cardiovascular Function and Risk Factors · Artificial Intelligence in Healthcare and Education · ECG Monitoring and Analysis
