Pan-disease clustering analysis of the trend of period prevalence

Sneha Jadhav; Chenjin Ma; Yefei Jiang; Ben-Chang Shia; Shuangge Ma

arXiv:1809.06852·stat.AP·September 20, 2018

Open Access

TL;DR

This study introduces a novel clustering method to analyze the joint prevalence trends of multiple diseases over time, revealing meaningful disease groupings from Taiwan's extensive health data.

Contribution

It develops a new penalization pursuit approach for pan-disease clustering of prevalence trends, applied to large-scale national health data.

Findings

01. Identified 35 disease clusters with similar prevalence trends

02. Found that the resulting clusters differ significantly from those produced by alternative clustering methods

03. Provided interpretable disease groupings with sound clinical relevance

Abstract

For all diseases, prevalence has been carefully studied. In the "classic" paradigm, the prevalence of different diseases has usually been studied separately. Accumulating evidence has shown that diseases can be "correlated". The joint analysis of the prevalence of multiple diseases can provide important insights beyond individual-disease analysis but has not been well conducted. In this study, we take advantage of the uniquely valuable Taiwan National Health Insurance Research Database (NHIRD) and conduct a pan-disease analysis of period prevalence trends. The goal is to identify clusters within which diseases share similar period prevalence trends. For this purpose, a novel penalization pursuit approach is developed, which has an intuitive formulation and satisfactory properties. In data analysis, the period prevalence values are computed using records on close to 1 million…

Taxonomy

Topics: Data-Driven Disease Surveillance · Bayesian Methods and Mixture Models · Genetic Associations and Epidemiology