# A Maturation-Aware Machine Learning Framework for Screening the Nutritional Status of Adolescents

**Authors:** Hatem Ghouili, Zouhaier Farhani, Narimen Yousfi, Halil İbrahim Ceylan, Amel Dridi, Andrea de Giorgio, Nicola Luigi Bragazzi, Noomen Guelmami, Ismail Dergaa, Anissa Bouassida

PMC · DOI: 10.3390/nu18040660 · 2026-02-17

## TL;DR

This study develops and validates a cost-sensitive Random Forest model that screens adolescents' nutritional status (underweight, normal weight, or overweight) while accounting for biological maturation and class imbalance.

## Contribution

A novel cost-sensitive Random Forest model, combined with Random Over-Sampling Examples (ROSE) resampling, is proposed to improve underweight detection in adolescents.
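The two ingredients named above, cost-sensitive weighting and ROSE-style resampling, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names, the toy data, the equal-probability choice of class, and the normal-reference bandwidth rule are all assumptions made for illustration.

```python
import random
import statistics
from collections import Counter, defaultdict


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def rose_like_resample(X, y, n_out, seed=0):
    """Draw n_out synthetic rows; every class is equally likely to be picked."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    classes = sorted(by_class)
    d = len(X[0])

    # Per-class, per-feature Gaussian bandwidth (normal-reference rule; assumed).
    bandwidth = {}
    for cls, rows in by_class.items():
        n_c = len(rows)
        factor = (4.0 / ((d + 2) * n_c)) ** (1.0 / (d + 4))
        bandwidth[cls] = [
            factor * (statistics.stdev(col) if n_c > 1 else 0.0)
            for col in zip(*rows)
        ]

    X_new, y_new = [], []
    for _ in range(n_out):
        cls = rng.choice(classes)             # step 1: pick a class uniformly
        seed_row = rng.choice(by_class[cls])  # step 2: pick an observed row
        # step 3: jitter that row with Gaussian noise scaled by the bandwidth
        X_new.append([rng.gauss(v, h) for v, h in zip(seed_row, bandwidth[cls])])
        y_new.append(cls)
    return X_new, y_new


# Hypothetical toy data: [body mass (kg), waist circumference (cm)]
X = [[38.0, 58.0], [40.5, 60.0], [52.0, 68.0], [55.0, 70.0], [57.5, 71.0],
     [60.0, 73.0], [54.0, 69.5], [78.0, 88.0], [82.0, 91.0]]
y = ["underweight"] * 2 + ["normal"] * 5 + ["overweight"] * 2

weights = balanced_class_weights(y)
X_new, y_new = rose_like_resample(X, y, n_out=300, seed=42)
print(weights)
print(Counter(y_new))
```

In a sketch like this, the resampled rows would be used only for training, and the weights could be handed to any learner that accepts per-class costs (for example, scikit-learn's `RandomForestClassifier` takes a `class_weight` dictionary). The test set should keep its original class distribution.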

## Key findings

- The cost-sensitive Random Forest model achieved high accuracy (0.830) and macro-AUC (0.921) in classifying nutritional status.
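- The model showed stable performance across maturation phases defined relative to peak height velocity (PHV), with accuracy between 0.823 and 0.839 and the best discrimination in the pre-PHV (macro-AUC = 0.936) and post-PHV (macro-AUC = 0.931) periods.
- Body mass was the most important predictor (importance = 1.00), followed by waist circumference (0.34–0.53). The importance of age for underweight classification rose from 0.10 (pre-PHV) to 0.75 (post-PHV).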
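For readers unfamiliar with the macro-AUC reported above, the following sketch shows one common way to compute it: a one-vs-rest AUC per class, averaged without weighting. Whether the paper used exactly this averaging scheme is an assumption, and the labels and probabilities below are made-up toy values, not study data.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, classes):
    """Unweighted mean of one-vs-rest AUCs; proba[i][k] is P(class k) for case i."""
    aucs = {}
    for k, cls in enumerate(classes):
        scores = [row[k] for row in proba]
        aucs[cls] = binary_auc(scores, [label == cls for label in y_true])
    return sum(aucs.values()) / len(aucs), aucs


# Hypothetical predictions for six adolescents (illustration only).
classes = ["underweight", "normal", "overweight"]
y_true = ["underweight", "underweight", "normal", "normal", "normal", "overweight"]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]

macro_auc, per_class_auc = macro_ovr_auc(y_true, proba, classes)
print(per_class_auc)
print(macro_auc)
```

Because every class contributes equally to the mean, a macro-AUC is not dominated by the majority (normal-weight) class, which is why it suits an imbalanced three-class problem.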

## Abstract

**Background:** Malnutrition in adolescents remains a significant public health issue worldwide, with undernutrition and overweight often coexisting. Accurate nutritional screening during adolescence is complicated by variability in biological maturation and by class imbalance, particularly among underweight adolescents.

**Objective:** This study aims to develop and validate machine learning models for classifying the nutritional status of adolescents, accounting for class imbalance and biological maturation, and to evaluate model stability and variable importance at different stages of peak height velocity (PHV).

**Methods:** In this cross-sectional study, 4232 adolescents aged 11 to 18 years were recruited from nine educational institutions in Tunisia. Their nutritional status was classified according to the International Obesity Task Force (IOTF) BMI thresholds into three categories: underweight (14.4%), normal weight (68.3%), and overweight (17.2%). Ten anthropometric, behavioral, and maturation-related predictors were analyzed. Six supervised machine learning algorithms were evaluated using a 70/30 stratified split between training and test sets, with five-fold cross-validation. Class imbalance was addressed by Random Over-Sampling Examples (ROSE) combined with cost-sensitive learning. Model performance was assessed using accuracy, Cohen’s kappa coefficient, macro F1 score, sensitivity, specificity, and AUC.

**Results:** The cost-sensitive Random Forest (RF) model achieved the best overall performance, with an accuracy of 0.830, a macro F1 score of 0.767, a macro-AUC of 0.921, and a macro-sensitivity of 0.743. The class-specific sensitivities were 0.70 (underweight), 0.91 (normal weight), and 0.62 (overweight), with no major misclassification between the extreme categories. Performance remained stable across the different maturation phases (accuracy from 0.823 to 0.839), with optimal discrimination in the pre-PHV (macro-AUC = 0.936; sensitivity for underweight = 0.82) and post-PHV (macro-AUC = 0.931) periods. Body mass was the main predictor (importance = 1.00), followed by waist circumference (0.34–0.53). The importance of age for classifying underweight increased significantly from the pre-PHV (0.10) to the post-PHV (0.75) period. A two-stage hierarchical model further improved underweight detection (stage 1 AUC = 0.911; sensitivity = 0.732).

**Conclusions:** A cost-sensitive RF model, combined with ROSE, provides robust classification of adolescents’ nutritional status across maturation phases, significantly improving underweight detection while preserving overall accuracy. This approach is particularly well-suited to public health screening in schools as a first-stage assessment that requires clinical confirmation, and it promotes a maturation-aware interpretation of nutritional risk among adolescents.
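The evaluation metrics listed in the abstract can all be derived from a three-class confusion matrix. The sketch below shows one way to do that. The matrix is hypothetical: it was chosen so that its class sensitivities equal the reported 0.70, 0.91, and 0.62, but its other values (accuracy, kappa, F1) do not reproduce the paper's results. The last lines check that the mean of the reported class sensitivities is consistent with the reported macro-sensitivity of 0.743.

```python
def metrics_from_confusion(cm):
    """cm[i][j] = number of cases with true class i that were predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    correct = sum(cm[i][i] for i in range(k))

    accuracy = correct / total
    # Chance agreement for Cohen's kappa comes from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / total**2
    kappa = (accuracy - expected) / (1 - expected)

    sens, spec, f1 = [], [], []
    for i in range(k):
        tp = cm[i][i]
        fn = row_tot[i] - tp
        fp = col_tot[i] - tp
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))          # recall for class i
        spec.append(tn / (tn + fp))          # one-vs-rest specificity
        f1.append(2 * tp / (2 * tp + fp + fn))
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sens,
        "specificity": spec,
        "macro_sensitivity": sum(sens) / k,
        "macro_f1": sum(f1) / k,
    }


# Hypothetical confusion matrix; rows and columns ordered as
# underweight, normal weight, overweight. Not the study's data.
cm = [
    [70, 30, 0],
    [20, 273, 7],
    [0, 38, 62],
]
m = metrics_from_confusion(cm)
print(m)

# Consistency check on the reported figures: mean of class sensitivities.
reported_macro_sensitivity = sum([0.70, 0.91, 0.62]) / 3
print(round(reported_macro_sensitivity, 3))
```

The zeros in the corner cells mirror the abstract's statement that there was no major misclassification between the extreme categories. In a screening setting, those are the most costly errors to make.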

## Full-text entities

- **Diseases:** insulin resistance (MESH:D007333), anxiety (MESH:D001007), Child and adolescent malnutrition (MESH:D015362), malignancies (MESH:D009369), delay (MESH:D006968), inflammatory, renal, gastrointestinal, or systemic diseases (MESH:D018746), PHV (MESH:C000719188), cardiovascular disease (MESH:D002318), stunting (MESH:D006130), dyslipidemia (MESH:D050171), Malnutrition (MESH:D044342), physical disabilities (MESH:D059445), metabolic syndrome (MESH:D024821), injury to (MESH:D014947), muscle mass (MESH:C536030), Overweight (MESH:D050177), ML (MESH:D007859), puberty (MESH:D011628), wasting (MESH:D019282), Obesity (MESH:D009765), adiposity (MESH:D018205), Underweight (MESH:D013851), weight gain (MESH:D015430), type 2 diabetes (MESH:D003924)
- **Chemicals:** cortisol (MESH:D006854)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12943452/full.md

---
Source: https://tomesphere.com/paper/PMC12943452