PRIME-CVD: A Parametrically Rendered Informatics Medical Environment for Education in Cardiovascular Risk Modelling
Nicholas I-Hsien Kuo, Marzia Hoque Tania, Blanca Gallego, Louisa Jorm

TL;DR
PRIME-CVD introduces a synthetic, parametrically generated medical environment for cardiovascular risk modelling education, enabling realistic, reproducible, and privacy-preserving data analysis training without using real patient data.
Contribution
The paper presents an openly accessible synthetic data environment for cardiovascular risk modelling, generated from a causal graph, which supports reproducible and privacy-preserving medical education.
Findings
Provides two openly accessible synthetic data assets representing a cohort of 50,000 adults undergoing primary prevention for cardiovascular disease.
Enables teaching data cleaning, causal reasoning, and risk modelling.
Ensures privacy while maintaining realistic data characteristics.
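To make the idea of generating a cohort from a causal graph concrete, the following is a minimal sketch of ancestral sampling over a toy directed acyclic graph. It is not the authors' generator: the graph (age → systolic blood pressure → CVD event, with age and smoking also pointing to the event), the function name `simulate_cohort`, and every coefficient and prevalence are hypothetical placeholders chosen for illustration, not values from PRIME-CVD or from Australian population statistics.

```python
import math
import random


def simulate_cohort(n=50_000, seed=0):
    """Sample a toy cohort by walking a small causal DAG from parents to children.

    Hypothetical graph: age -> sbp, age -> cvd, smoker -> cvd, sbp -> cvd.
    All numbers below are illustrative assumptions, not PRIME-CVD parameters.
    """
    rng = random.Random(seed)  # fixed seed makes the cohort reproducible
    cohort = []
    for _ in range(n):
        # Root nodes: no parents, drawn from assumed marginal distributions
        age = rng.uniform(30, 75)
        smoker = rng.random() < 0.12
        # Child node: systolic blood pressure depends on age plus noise
        sbp = 100 + 0.5 * age + rng.gauss(0, 12)
        # Outcome node: logistic model over its parents
        logit = -9.0 + 0.07 * age + 0.02 * sbp + 0.6 * smoker
        p_cvd = 1 / (1 + math.exp(-logit))
        cvd = rng.random() < p_cvd
        cohort.append({"age": age, "smoker": smoker, "sbp": sbp, "cvd": cvd})
    return cohort


cohort = simulate_cohort()
event_rate = sum(row["cvd"] for row in cohort) / len(cohort)
print(f"{len(cohort)} synthetic adults, toy CVD event rate {event_rate:.3f}")
```

Because every record is drawn from specified distributions rather than derived from real patients, a sketch like this contains no patient-level information, and rerunning it with the same seed reproduces the same cohort.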
Abstract
In recent years, progress in medical informatics and machine learning has been accelerated by the availability of openly accessible benchmark datasets. However, patient-level electronic medical record (EMR) data are rarely available for teaching or methodological development due to privacy, governance, and re-identification risks. This has limited reproducibility, transparency, and hands-on training in cardiovascular risk modelling. Here we introduce PRIME-CVD, a parametrically rendered informatics medical environment designed explicitly for medical education. PRIME-CVD comprises two openly accessible synthetic data assets representing a cohort of 50,000 adults undergoing primary prevention for cardiovascular disease. The datasets are generated entirely from a user-specified causal directed acyclic graph parameterised using publicly available Australian population statistics and…
Taxonomy
Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Electronic Health Records Systems
