Clustering country-level all-cause mortality data: a review
Pedro Menezes de Araujo, Isobel Claire Gormley, Thomas Brendan Murphy

TL;DR
This review summarizes how clustering methods are applied to country-level all-cause mortality data, highlighting common approaches, key findings such as a persistent East-West division in Europe, and gaps such as limited data diversity and limited evaluation of clustering quality.
Contribution
It provides a comprehensive overview of clustering applications in mortality studies, covering methodological choices and main findings, and identifies areas needing further research.
Findings
Clustering reveals a persistent East-West European division.
Clustering improves forecast accuracy over single-country models.
Most studies use HMD data from developed countries and rely on common clustering methods such as k-means and hierarchical clustering.
Abstract
Mortality data are relevant to demography, public health, and actuarial science. Whilst clustering is increasingly used to explore patterns in such data, no study has reviewed its application to country-level all-cause mortality. This review therefore summarises recent work and addresses key questions: why clustering is used, which mortality data are analysed, which methods are most common, and what main findings emerge. To address these questions, we examine studies applying clustering to country-level all-cause mortality, focusing on mortality indices, data sources, and methodological choices, and we replicate some approaches using Human Mortality Database (HMD) data. Our analysis reveals that clustering is mainly motivated by forecasting and by studying convergence and inequality. Most studies use HMD data from developed countries and rely on k-means, hierarchical, or functional…
Taxonomy
Topics: Insurance, Mortality, Demography, Risk Management · Global Maternal and Child Health · Data-Driven Disease Surveillance
