Concordance of Lung Cancer, Melanoma, and Renal Cell Cancer Diagnosis Information Recorded in Health Care Databases in England: Analysis of Linkage Between Primary Care, Hospital Care, and Cancer Registry Data
Paul D. Kruithof, Patrick C. Souverein, Johanna H. M. Driessen, Lizza E. L. Hendriks, Sander Croes, Robin M. J. M. van Geel

TL;DR
This study evaluates how well cancer diagnosis data from different healthcare databases in England align, finding high agreement for lung cancer but longer delays for melanoma and kidney cancer.
Contribution
The study provides a detailed assessment of data linkage quality across primary care, hospital care, and cancer registry databases for three cancer types in England.
Findings
Concordance of cancer diagnosis records between NCRAS, CPRD Aurum, and HES-APC exceeded 70%.
Most lung cancer diagnoses were captured within 3 months of initial diagnosis across datasets.
The Systemic Anti-Cancer Therapy (SACT) dataset had significantly fewer matched patients, especially among those over 80 years old.
Abstract
Real‐world evidence (RWE) addresses clinical trial limitations by capturing more representative patient populations and improves evaluation of anticancer treatments, although it becomes available only years after market authorization. As many RWE sources capture only parts of the healthcare continuum, dataset linkage is necessary to enhance data richness. Linkage quality must be assessed to prevent information bias due to incomplete data linkage. We evaluated diagnosis concordance for lung cancer (LC), melanoma, and renal cell cancer (RCC) in England. Patients were matched based on National Health Service (NHS) number, sex, and date of birth. Eligible patients were drawn from the National Cancer Registry and Analysis Service (NCRAS), and matched with three other datasets: Clinical Practice Research Datalink Aurum (CPRD Aurum), Hospital Episode Statistics Admitted Patient Care (HES‐APC),…
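The matching rule described in the abstract (NHS number, sex, and date of birth) can be illustrated with a minimal sketch. This is not the authors' code: the field names (`nhs_number`, `sex`, `dob`, `diagnosis_date`), the `link_key` and `concordance` helpers, the 90-day approximation of a 3-month window, and all records below are hypothetical and synthetic, chosen only to show how a concordance proportion might be computed against registry patients.

```python
from datetime import date


def link_key(record):
    """Deterministic match key: NHS number, sex, and date of birth."""
    return (record["nhs_number"], record["sex"], record["dob"])


def concordance(registry, other, window_days=None):
    """Share of registry patients with a matching record in `other`.

    If window_days is given, the linked diagnosis date must also fall
    within that many days of the registry diagnosis date.
    """
    other_by_key = {link_key(r): r for r in other}
    matched = 0
    for rec in registry:
        hit = other_by_key.get(link_key(rec))
        if hit is None:
            continue  # no record with the same NHS number, sex, and DOB
        if window_days is not None:
            gap = abs((hit["diagnosis_date"] - rec["diagnosis_date"]).days)
            if gap > window_days:
                continue  # linked, but diagnosis dates too far apart
        matched += 1
    return matched / len(registry) if registry else 0.0


def make(nhs, sex, dob, dx):
    """Build one synthetic patient record (hypothetical schema)."""
    return {"nhs_number": nhs, "sex": sex, "dob": dob, "diagnosis_date": dx}


# Synthetic registry cohort (stands in for the reference dataset).
registry = [
    make("0000000001", "F", date(1950, 1, 1), date(2015, 3, 1)),
    make("0000000002", "M", date(1948, 6, 15), date(2016, 7, 10)),
    make("0000000003", "F", date(1960, 11, 30), date(2017, 1, 20)),
    make("0000000004", "M", date(1939, 2, 2), date(2018, 5, 5)),
]

# Synthetic comparison dataset (stands in for a hospital or primary care source).
hospital = [
    make("0000000001", "F", date(1950, 1, 1), date(2015, 3, 20)),   # 19 days later
    make("0000000002", "M", date(1948, 6, 15), date(2017, 1, 26)),  # 200 days later
    make("0000000003", "M", date(1960, 11, 30), date(2017, 1, 20)), # sex mismatch
    make("0000000004", "M", date(1939, 2, 2), date(2018, 5, 5)),    # same day
]

overall = concordance(registry, hospital)
within_window = concordance(registry, hospital, window_days=90)
print(f"overall: {overall:.2f}, within 90 days: {within_window:.2f}")
```

Keeping the key match separate from the date-window check makes it possible to report overall concordance and time-restricted concordance from the same linked records.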
Taxonomy
Topics: Data Quality and Management · Advanced Causal Inference Techniques · Biomedical Text Mining and Ontologies
