Developing an interpretable machine learning predictive model of chronic obstructive pulmonary disease by serum PFAS concentration
Xiaomei Shao, Ling Zhang, Yuting Wang, Youmei Ying, Xueqin Chen

TL;DR
This study uses machine learning to predict COPD risk from blood levels of PFAS chemicals, finding that some are associated with lower risk and others with higher risk.
Contribution
Applies interpretable machine learning to the PFAS-COPD association and provides a public web-based risk calculator.
Findings
The CatBoost model achieved 84% accuracy and an AUC of 0.89 in predicting COPD from PFAS levels.
PFOS and PFUA were associated with reduced COPD risk, while PFOA and MPAH were associated with increased risk.
SHAP analysis clarified variable contributions and individual prediction explanations.
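As a rough illustration of what a SHAP-style attribution computes, the sketch below evaluates exact Shapley values for a toy three-feature scoring function. Everything in it is an assumption made for illustration: `toy_model`, the input values, and the all-zero baseline are hypothetical and unrelated to the paper's CatBoost model or its data. A single baseline vector stands in for the background data that a SHAP library would average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with coalition features taken from x
        # and all remaining features taken from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical score with one interaction term; not the paper's model.
    return 2.0 * z[0] - 1.0 * z[1] + z[0] * z[2]


x = [1.0, 1.0, 1.0]          # hypothetical input
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)

# Attributions sum to the difference between the prediction and the baseline output.
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The interaction term `z[0] * z[2]` is split evenly between the two features involved, which is why the first feature receives more than its linear coefficient alone would suggest.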
Abstract
Chronic obstructive pulmonary disease (COPD) is a leading cause of morbidity and mortality worldwide, with limited early detection strategies. While previous studies have examined the relationship between per- and polyfluoroalkyl substances (PFAS) and COPD, limited research has applied interpretable machine learning (ML) techniques to this association. We investigated the association between PFAS exposure and COPD risk in 4,450 National Health and Nutrition Examination Survey (NHANES) participants from 2013 to 2018. After excluding missing covariates and extreme PFAS values and applying K-nearest neighbors (KNN) imputation, nine ML models, including CatBoost, were built and evaluated using metrics like accuracy, area under the curve (AUC), sensitivity, and specificity. The best-performing model was further analyzed using partial dependence plots (PDP) and SHapley additive exPlanations…
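The four evaluation metrics named in the abstract can be computed from true labels and predicted probabilities as in the minimal sketch below. The labels, scores, and the 0.5 decision threshold are hypothetical values chosen for illustration, not data or settings from the study, and the function names are not taken from the paper. The code assumes both classes are present in the labels.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and AUC for a binary classifier."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc(y_true, y_score),
    }


# Hypothetical labels (1 = COPD) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
print(classification_metrics(y_true, y_score))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are usually reported together.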
Taxonomy
Topics: Chronic Obstructive Pulmonary Disease (COPD) Research · Air Quality and Health Impacts · Respiratory Support and Mechanisms
