Development and external validation of an interpretable machine learning-based model for obesity risk prediction in 2–18-year-old children and adolescents in Beijing and Tangshan
Mei Xue, Shufang Liu, Xiaoqian Zhang, Zhixin Zhang, Wenquan Niu

TL;DR
This study developed and externally validated a machine learning model to predict childhood obesity risk and identified key predictors such as BMI and sleep duration.
Contribution
The novel contribution is an interpretable XGBoost model for childhood obesity prediction with a web-based tool for real-time risk assessment.
Findings
The XGBoost model outperformed logistic regression with an AUROC of 0.875 in external validation.
Nine key predictors of childhood obesity were identified using SHAP analysis, including parental BMI and sleep duration.
A web-based tool was developed to provide individualized risk probabilities and explanations.
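To make the idea of an individualized explanation concrete, the following is a minimal sketch of exact Shapley values for a single prediction. Everything in it is an assumption for illustration: `toy_risk`, its coefficients, the three feature names, and the baseline child are hypothetical and are not the paper's model or its nine predictors. It also replaces missing features with one baseline value, which is simpler than what SHAP computes for tree models such as XGBoost.

```python
from itertools import combinations
from math import exp, factorial


def toy_risk(x):
    """Hypothetical risk model with one interaction term (NOT the paper's model)."""
    z = (-2.0
         + 0.10 * (x["parent_bmi"] - 22.0)
         - 0.40 * (x["sleep_hours"] - 9.0)
         + 0.30 * x["screen_hours"]
         + 0.05 * (x["parent_bmi"] - 22.0) * x["screen_hours"])
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every coalition of the other features.

    Features outside a coalition take their baseline value. The cost grows as
    2^n, so this is only practical for a handful of features.
    """
    names = list(instance)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {m: instance[m] if (m in subset or m == name) else baseline[m]
                          for m in names}
                without_f = {m: instance[m] if m in subset else baseline[m]
                             for m in names}
                total += weight * (model(with_f) - model(without_f))
        phi[name] = total
    return phi


# Hypothetical child and hypothetical reference ("average") child.
instance = {"parent_bmi": 29.0, "sleep_hours": 7.0, "screen_hours": 3.0}
baseline = {"parent_bmi": 23.0, "sleep_hours": 9.0, "screen_hours": 1.0}

phi = shapley_values(toy_risk, instance, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the predicted risk for this child and for the baseline child, which is the property that lets a tool display a risk probability alongside per-factor explanations.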
Abstract
The multifactorial mechanisms driving childhood obesity, a global public health challenge, are yet to be fully elucidated. We aimed to develop and externally validate three widely applied machine learning models alongside logistic regression in 2–18-year-old children and adolescents in Beijing and Tangshan to predict obesity risk. As a further step, we wanted to interpret the optimised model and translate it into a web-based tool to inform clinical decision-making. We analysed data of 19 024 (training/testing) and 2410 (external validation) children and adolescents from Beijing and Tangshan, respectively. Using a set of factors including demographic, familial, socioeconomic, lifestyle, and perinatal variables, we developed four models (light gradient boosting machine, random forest, eXtreme gradient boosting (XGBoost), and logistic regression) and compared their predictive performance.…
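The models are compared on predictive performance, reported in the findings as AUROC. As a minimal sketch of what that metric measures, the code below computes AUROC as the probability that a randomly chosen case with the outcome is scored higher than a randomly chosen case without it, with ties counted as half. The labels and the two score lists are made-up toy values, not study data, and `model_a` and `model_b` are hypothetical names.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Equivalent to the Mann-Whitney U statistic divided by (n_pos * n_neg).
    Quadratic in the sample size, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = obese, 0 = not obese.
labels = [0, 0, 1, 1, 0, 1]
model_a = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]  # predicted risks from one model
model_b = [0.30, 0.60, 0.50, 0.70, 0.55, 0.40]  # predicted risks from another

print(f"model_a AUROC: {auroc(labels, model_a):.3f}")
print(f"model_b AUROC: {auroc(labels, model_b):.3f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, so comparing models on the same held-out or external data reduces to comparing these numbers.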
Taxonomy
Topics: Obesity, Physical Activity, Diet · Artificial Intelligence in Healthcare · Body Composition Measurement Techniques
