Interpretable prediction of gross motor coordination in children aged 9–10 using machine learning and SHAP: the influence of physical fitness, basic coordination, and executive function
Lingfeng Mao, Yuan Sui, Xiangyang Ding, Min He, Liqin Deng, Yue Shi, Fei Li

TL;DR
This study uses machine learning to predict gross motor coordination in children aged 9–10, finding that physical fitness and balance are more important than cognitive functions.
Contribution
The study introduces interpretable machine learning models with SHAP to analyze factors influencing gross motor coordination in children.
Findings
Random Forest Regression outperformed traditional regression in predicting gross motor coordination.
Spatial orientation, BMI, and balance were key predictors with nonlinear effects.
Executive function had minimal impact on gross motor coordination.
Abstract
Gross motor coordination is a fundamental component of children’s physical development and motor skill acquisition, closely associated with physical fitness, cognitive function, and overall health. This study aimed to examine the influence of physical fitness, basic coordination, and executive function (EF) on gross motor coordination, and to evaluate the predictive performance of machine learning models compared with traditional multiple linear regression (MLR). A total of 167 children (85 boys and 82 girls), aged 9–10 years, participated in the study. Gross motor coordination was assessed using the Körperkoordinationstest für Kinder (KTK). Physical fitness (e.g., 50 m sprint, standing long jump, sit-ups), basic coordination (e.g., kinesthetic differentiation, spatial orientation, balance), and EF (e.g., inhibitory control, working memory) were measured as predictors. Model performance…
Shanghai Key Lab of Human Performance (Shanghai University of Sport)
Taxonomy
Topics: Children's Physical and Motor Development · Motor Control and Adaptation · Cerebral Palsy and Movement Disorders
Introduction
As a fundamental component of motor competence, gross motor coordination (GMC) supports essential large-movement tasks such as running, jumping, and balancing by reflecting the efficient interaction among the musculoskeletal, nervous, and sensory systems, allowing the body to perform precise and balanced movements with minimal energy expenditure (Gallahue & Ozmun, 2006; Latash & Lestienne, 2006). Previous studies have demonstrated that GMC is strongly associated with not only physical activity participation, self-care ability, and physical health, but also emotional regulation, social adaptation, and academic performance (Hulteen et al., 2018; Lopes et al., 2011; Rivilis et al., 2011). Children with lower GMC typically experience greater difficulties with motor tasks, daily activities, and school or social interactions (Cummins, Piek & Dyck, 2005; Fernandes et al., 2016). Moreover, approximately 9–28% of children and adolescents aged 10–12 present with GMC difficulties or related disorders (Tsiotra et al., 2006). During childhood, GMC directly or indirectly affects physical health and shapes long-term health trajectories (Cattuzzo et al., 2016). Accordingly, most experts advocate early interventions to enhance GMC in children. Ages 6–12 are considered a critical window for GMC development (Wade & Whiting, 1986), as delays during this period may hinder the acquisition of motor skills and negatively impact long-term physical and psychological health (Lopes et al., 2011).
The development of GMC in children is shaped by a constellation of factors, including physical fitness components (e.g., speed, strength, and flexibility) (Dos Santos et al., 2018), basic coordination capacities (BCC) such as kinesthetic differentiation, spatial orientation, balance, rhythm, and motor reaction, as well as executive function (Silvestri et al., 2025). Although many studies have demonstrated the significance of these factors, their relative contributions to GMC remain inconclusive (D’Hondt et al., 2014; De Chaves et al., 2016; Fernandes et al., 2016; Freitas et al., 2015; Giuriato et al., 2021; Sui et al., 2024). Importantly, GMC development is not a linear process but emerges from dynamic, multifactorial interactions: studies have found that body mass index (BMI) shows a U-shaped relationship with GMC, and that the effects of spatial orientation and balance exhibit nonlinear patterns depending on ability levels (Dos Santos et al., 2018; Sui et al., 2024). However, most existing studies have relied on traditional statistical methods such as correlation analysis, one-way ANOVA, and multiple linear regression (D’Hondt et al., 2013; De Chaves et al., 2016; Giuriato et al., 2021), which generally assume linear and additive relations and are therefore limited in capturing the complex, nonlinear mechanisms underlying GMC. Consequently, linear models may understate or mischaracterize these associations (Schober & Vetter, 2021). To better elucidate the predictive mechanisms of GMC, more flexible and expressive modeling approaches are needed.
In recent years, machine learning (ML) methods have demonstrated significant advantages over traditional statistical approaches in handling nonlinear, multivariable, and high-dimensional data. ML has been widely applied in domains such as health behavior prediction, early risk identification, and personalized intervention (Ali et al., 2021; Mandorino, Clubb & Lacome, 2024). At the same time, the game theory-based interpretability framework SHapley Additive exPlanations (SHAP) quantifies the contribution of each feature to the model’s predictions, providing both global and local interpretability (Lundberg et al., 2018; Lundberg & Lee, 2017). This makes complex models easier to understand and enhances the transparency and explanatory power of the model. SHAP analysis has already been widely applied in analyzing disease risk factors and running performance (Knechtle et al., 2025; Wang et al., 2021). Nevertheless, in the current research landscape of children’s GMC, the systematic application of machine learning and interpretable modeling techniques remains limited, particularly regarding model comparison and the identification of influential predictors.
To address this gap, the present study drew on data from 167 children aged 9–10 years, including measures of physical fitness, basic coordination capacities, and executive function. We modeled GMC performance using ML algorithms alongside multiple linear regression (MLR) and compared their predictive performance. We also applied SHAP analysis to interpret feature importance and probe potential nonlinear associations. By combining predictive accuracy with interpretability, this study improves GMC modeling in children and supports early identification and individualized interventions for coordination deficits.
Materials and Methods
Participants
Children aged 9–10 years were recruited using a convenience sampling method from three public primary schools in Shanghai, China (see Table 1 for descriptive statistics). Inclusion criteria were: (1) age between 9 and 10 years; (2) physical and cognitive ability to complete all assessments; (3) normal intellectual development; and (4) absence of congenital limb deformities or psychiatric disorders. Exclusion criteria were: (1) missing or abnormal data in any test; (2) limb injuries within 30 days prior to assessment; and (3) diagnosed conditions such as intellectual disability, muscular dystrophy, or cardiovascular disease. In total, 167 children (85 boys, 82 girls) met eligibility and were included, yielding a balanced sex distribution. The study was approved by the Shanghai University of Sport Ethics Committee (Approval No. 102772023RT108). Written informed consent was obtained from guardians, and verbal assent from children.
Table 1: Descriptive statistics of basic information.
Assessment procedures
All assessments were conducted between July and August 2023 in the gymnasiums and outdoor playgrounds of the participating schools. All tests were administered by trained sport-science examiners using standardized protocols, with each item consistently delivered by the same examiner to enhance reliability. To reduce bias from task unfamiliarity, participants received standardized briefings and demonstrations, and completed 2–3 familiarization trials per item to learn the required movement patterns and procedures, thereby improving measurement reliability. On each testing day, participants completed a 15-min standardized warm-up consisting of light jogging, core activation, and dynamic stretching. Subsequently, the following four test categories were administered in a fixed sequence: (1) Gross motor coordination, (2) Physical fitness, (3) Basic coordination capacities, and (4) Executive function. The executive function was assessed using the Behavior Rating Inventory of Executive Function (BRIEF, Parent Form), completed by participants’ primary caregivers. To minimize fatigue-related confounds, each test category was scheduled at least 24 h apart. Within each session, participants rested 3–5 min between subtests, during which perceived fatigue was rated using a subjective fatigue scale. Formal testing resumed only when the fatigue score was below 3. A schematic overview of the Test Procedure is shown in Fig. 1.
Figure 1: Test procedure.
Gross motor coordination assessment
GMC was assessed using the Körperkoordinationstest für Kinder (KTK), a standardized and reliable test battery for children (Kiphard & Schilling, 1974). The KTK consists of four subtests: (1) Walking Backwards: Participants walked backward three times on each of three balance beams with decreasing widths (6, 4.5, and 3 cm; each 3 m long). The number of successful steps (maximum eight steps per trial) taken before stepping off the beam was recorded. (2) Jumping Sideways: Within a 50 cm × 100 cm area, participants performed as many consecutive two-footed lateral jumps over a small obstacle as possible in 15 s. The total number of jumps was recorded. (3) Hopping for Height: Using incrementally stacked foam blocks (each 50 cm × 20 cm × 5 cm), participants attempted single-leg hops. The highest successfully cleared height was recorded. (4) Moving Sideways: Starting on one platform while holding another, participants moved laterally by transferring and stepping onto the platforms as many times as possible within 20 s, with the number of successful transfers recorded. Raw scores from the four subtests were converted to the Motor Quotient (MQ) using age- and sex-specific Flemish norms provided in the KTK manual. The MQ served as the primary outcome measure of GMC in this study (Vandorpe et al., 2011).
Physical fitness assessment
Physical fitness was assessed using items from the Youth Fitness International Test (YFIT) (Ortega et al., 2024), supplemented with additional evidence-based tests. The specific protocols are as follows: (1) 50-m sprint was used to evaluate sprint speed and anaerobic power. Timing gates were placed at the start and finish lines on a rubber track, with completion time recorded to the nearest 0.1 s (Hands, 2008). (2) Standing long jump assessed lower-limb explosive strength. Participants performed three maximal forward jumps from a standing position; the longest distance was recorded to the nearest 0.1 cm (Haga, 2009). (3) Sit-up measured muscular endurance. Participants lay on a mat with knees bent at 90°, feet secured, and hands placed on the chest or beside the ears. They performed as many correct repetitions as possible within 1 min; only sit-ups where the scapulae fully lifted off the mat were counted (Vaara et al., 2015). (4) The sit-and-reach test assessed flexibility using a standardized device (model PL-009-14A, Peilin, China). Participants sat with legs extended and pushed a sliding marker forward with their fingertips; the farthest distance reached was recorded to the nearest 0.1 cm. Three trials were performed, and the best result was used for analysis (Vaara et al., 2015). (5) Grip strength was measured using a digital hand dynamometer (Camry EH101, Xiangshan, China), adjusted according to hand size and sex. Grip strength is commonly used as a proxy for overall maximal muscular strength, particularly of the upper limbs. The best of two trials for each hand was recorded to the nearest 0.1 kg (Norman et al., 2011). (6) Rope skipping was used to assess coordination and aerobic capacity. Participants performed as many correct bilateral jumps as possible within 1 min; only successful jumps were counted toward the final score (Zhang et al., 2023).
Basic coordination capacity assessment
Basic motor coordination refers to an individual’s ability to regulate timing, spatial orientation, and force during both simple and complex movements. Six subtests were used to assess this capacity (Sui et al., 2024). These subtests have been shown to possess high reliability and validity, and are particularly well-suited for evaluating BCC in children aged 9–10 years.
(1) Kinesthetic differentiation was evaluated using the target standing long jump test, in which the target was set at 70% of each participant’s maximal jump distance. Participants performed three jumps aiming to land as close as possible to the target, and the mean absolute deviation from the target was recorded (Đolo, Grgantov & Milić, 2019). (2) Spatial orientation was assessed via the numbered medicine ball running test. After three guided practice trials along a predefined route (see Fig. 1) and a 3 min rest, participants responded to verbal cues by sprinting from point B through a photoelectric timing gate toward designated marker cones, then returning to point B. The process was repeated three times, with the final trial requiring a second pass through the gate. Total time was recorded to the nearest 0.01 s (Peker & Vural, 2018). (3) Dynamic balance was assessed using the Y Balance Test. Leg length was measured supine from the anterior superior iliac spine (ASIS) to the most distal point of the medial malleolus, accurate to 0.5 cm. To eliminate the stabilizing effect of footwear, all tests were performed barefoot. During the Y Balance Test, participants stood on the central platform with both hands placed firmly on their hips and balanced on one leg, while reaching with the other leg in three directions—anterior, posteromedial, and posterolateral—to push the reach indicator as far as possible. After each reach, participants returned to the starting position while maintaining balance. Each direction was tested three times, and the maximum reach distance was recorded to the nearest 0.5 cm. The composite score was calculated by summing the maximum distances in all three directions, dividing by leg length, and averaging the scores of both legs for subsequent analysis (Fusco et al., 2020). (4) Static balance was tested under both eyes-open and eyes-closed conditions. Participants stood with hands on hips and lifted one leg 10–20 cm off the ground. 
Each trial lasted up to 60 s and was terminated if the hands or lifted leg touched the ground, the trunk swayed noticeably, or the support foot shifted. Each leg was tested three times per condition, and the average duration across both legs was used for analysis (Hands, 2008). (5) Rhythm ability was measured using the rhythmic sprint test. After completing two 30 m maximal sprints, participants ran through 11 rhythm rings arranged in a fixed pattern (see Fig. 1). Sprint times were recorded using timing gates. Rhythm score was calculated by subtracting the rhythmic sprint time from the best 30 m sprint time (Peker & Vural, 2018). (6) Motor reaction was assessed using an electronic reaction timer (FYS-II, Zhejiang Psychological Instrument Co., Hangzhou, China). Participants pressed a response button as quickly as possible following a photonic signal. Three trials were conducted, and the fastest response time was recorded to the nearest 0.01 s (Bańkosz, Nawara & Ociepa, 2013; Wang et al., 2006).
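The Y Balance composite score described in item (3) can be sketched as follows. This is a minimal illustration that follows the text as written (sum of the maximum reach in each direction, divided by leg length, averaged over both legs); the function names, the use of a single leg-length value for both legs, and all reach distances are hypothetical and are not data from the study.

```python
from statistics import mean


def leg_score(reaches_cm, leg_length_cm):
    """Sum of the best reach in each direction, divided by leg length.

    reaches_cm maps a direction name to the list of trial distances (cm).
    """
    if leg_length_cm <= 0:
        raise ValueError("leg length must be positive")
    return sum(max(trials) for trials in reaches_cm.values()) / leg_length_cm


def y_balance_composite(left_reaches_cm, right_reaches_cm, leg_length_cm):
    """Average of the left-leg and right-leg normalized scores."""
    return mean([
        leg_score(left_reaches_cm, leg_length_cm),
        leg_score(right_reaches_cm, leg_length_cm),
    ])


# Hypothetical trials (three per direction), for illustration only.
left = {
    "anterior": [60.0, 62.5, 61.0],
    "posteromedial": [95.0, 97.5, 96.0],
    "posterolateral": [90.0, 92.0, 91.5],
}
right = {
    "anterior": [61.0, 63.0, 62.0],
    "posteromedial": [96.0, 98.0, 97.0],
    "posterolateral": [91.0, 93.0, 92.5],
}
composite = y_balance_composite(left, right, leg_length_cm=70.0)
```

Some Y Balance protocols additionally divide by three and multiply by 100 to express the composite as a percentage of leg length; the sketch does not apply that scaling because the text does not mention it.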
Executive function assessment
Executive function (EF) was assessed using the Behavior Rating Inventory of Executive Function (BRIEF), Parent Form (Gioia et al., 2000). This questionnaire, completed by participants’ primary caregivers, consists of 86 items and yields three main scores: the Behavioral Regulation Index (BRI), the Metacognition Index (MI), and the Global Executive Composite (GEC). These indices are derived from eight subscales: Inhibit, Shift, Emotional Control, Initiate, Working Memory, Plan, Organize, and Monitor, which together evaluate core components of executive functioning in children aged 6 to 18 years. The Chinese BRIEF has demonstrated solid psychometrics, with test–retest reliability r = 0.68–0.89 and internal consistency α = 0.74–0.96, supporting acceptable temporal stability and internal reliability. Accordingly, the BRIEF is appropriate for assessing EF in school-aged children in China (Qian & Wang, 2007).
Machine learning
Given the potential nonlinear relationships among variables, four machine learning algorithms known for their strong performance in nonlinear regression tasks were selected for model development and comparison: (1) Extreme Gradient Boosting (XGBoost): regularized gradient-boosted trees that perform strongly on tabular data; (2) Light Gradient Boosting Machine Regression (LGBM): histogram-based, leaf-wise boosting that is fast and scalable; (3) Random Forest Regression (RFR): an ensemble of bagged trees capturing nonlinearity and interactions; (4) Support Vector Regression (SVR): margin-based regression in which kernels enable nonlinear modeling. These algorithms have demonstrated high predictive accuracy and computational efficiency in handling complex patterns in high-dimensional feature spaces (Gu et al., 2015; Kensert et al., 2018; Mandorino et al., 2021; Paleczek et al., 2024).
Feature selection and data preprocessing
The dataset was randomly split into a training set and a test set at a ratio of 4:1. To address multicollinearity, pairwise Spearman correlation coefficients were calculated among all variables in the training set, and variables with high correlation (R > 0.85) were excluded. The remaining variables were retained for further modeling. Since gender was included as a categorical feature, one-hot encoding was applied (Okada, Ohzeki & Taguchi, 2019). All features were standardized prior to model training. Additionally, Recursive Feature Elimination (RFE) was used in the training set to eliminate redundant variables and retain the 15 most informative features for subsequent training, aiming to reduce dimensionality and improve model performance (Zhao et al., 2021). The 4:1 split balances two concerns: a very small test set yields unstable performance estimates, whereas an overly large test set leaves too little data for model development. Accordingly, 20% of the data were held out, untouched, for the final evaluation, and the remaining 80% were used exclusively for feature selection, hyperparameter optimization, and model training.
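The correlation filter described above can be sketched in plain Python. This is an illustrative sketch only: the rule for which member of a highly correlated pair to drop (here, the later-listed variable), the use of the absolute correlation, the column names, and the toy values are all assumptions, since the text specifies only the Spearman coefficient and the 0.85 threshold. The sketch also assumes no column is constant.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tie_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tie_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_correlated(columns, threshold=0.85):
    """Keep the first-listed variable of each pair with |rho| > threshold."""
    dropped = set()
    for a, b in combinations(columns, 2):
        if a in dropped or b in dropped:
            continue
        if abs(spearman(columns[a], columns[b])) > threshold:
            dropped.add(b)  # assumption: drop the later-listed variable
    return [name for name in columns if name not in dropped]


# Hypothetical training-set columns, for illustration only.
train_columns = {
    "bmi": [15.2, 17.8, 16.1, 21.4, 19.0, 14.7],
    "weight": [30.1, 36.5, 32.0, 47.9, 40.2, 29.5],
    "sprint_50m": [9.8, 9.1, 10.4, 9.9, 9.0, 10.1],
}
kept = drop_correlated(train_columns)
```

Computing the correlations on the training columns only, as the text describes, keeps the held-out test set from influencing which variables enter the models.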
Model development, validation, and interpretation
Randomized search with 5-fold cross-validation was used to optimize the hyperparameters of the XGBoost, LGBM, RFR, and SVR models (Mandorino, Clubb & Lacome, 2024). Model performance on the test set was evaluated and compared using three metrics: coefficient of determination (R^2^), mean absolute error (MAE), and root mean squared error (RMSE). Given the common criticism that machine learning models function as “black boxes,” this study incorporated SHapley Additive exPlanations (SHAP), a game-theoretic approach proposed by Lundberg, Erion & Lee (2018), to enhance model interpretability. SHAP values were used to explain the predictions of the best-performing model and to identify the most influential features contributing to GMC (Lundberg, Erion & Lee, 2018). An overview of the end-to-end analytical pipeline (covering data preprocessing, feature selection and hyperparameter optimization, model training/evaluation, and SHAP-based interpretation) is shown in Fig. 2.
Figure 2: Flowchart of model training, evaluation, and SHAP value analysis.
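To make the game-theoretic idea behind SHAP concrete, the sketch below computes exact Shapley values for a small toy model by enumerating all feature coalitions. It is not the procedure used in the study: the toy model, its coefficients, and the single all-zero baseline (standing in for "feature absent") are hypothetical, and practical SHAP implementations for tree ensembles use far more efficient algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for one instance x.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical score with one interaction term (illustration only)."""
    spatial_time, dynamic_balance, long_jump = z
    return 85.0 - 3.0 * spatial_time + 2.0 * dynamic_balance + dynamic_balance * long_jump


x = [1.0, 0.5, -1.0]        # hypothetical standardized feature values
baseline = [0.0, 0.0, 0.0]  # assumption: the standardized mean as reference
phi = shapley_values(toy_model, x, baseline)
```

The values sum to the difference between the prediction for `x` and the prediction at the baseline, which is the additivity property that lets SHAP split a single prediction into signed per-feature contributions.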
Multiple linear regression analysis
In addition to the machine learning models, a multiple linear regression analysis using ordinary least squares (OLS) was conducted. After removing highly correlated variables, the remaining features were one-hot encoded and split into training and test sets. The regression model’s performance was evaluated using the same metrics: R^2^, MAE, and RMSE, to assess model fit and prediction error. All analyses were conducted using PyCharm Community Edition 2024.2.2, and the statistical significance level was set at p < 0.05.
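The three evaluation metrics named above follow their standard definitions, sketched here in plain Python. The observed and predicted values are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, and RMSE for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": sqrt(ss_res / n),
    }


# Hypothetical motor quotient values and predictions, for illustration only.
observed = [80.0, 90.0, 100.0, 85.0]
predicted = [82.0, 88.0, 97.0, 86.0]
metrics = regression_metrics(observed, predicted)
```

MAE and RMSE are expressed in the units of the outcome, whereas R^2^ is unitless; RMSE weights large errors more heavily than MAE, so the two can rank models differently.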
Results
Feature selection
This study included 167 children aged 9–10 years (85 boys, 82 girls; height: 1.41 ± 0.74 m; weight: 36.89 ± 9.02 kg; Table 1). Overall GMC levels were comparable between boys and girls (86.63 ± 9.29 vs. 87.68 ± 9.39; Table 2). In terms of physical fitness, boys showed slightly higher performance in standing long jump and grip strength (131.79 ± 22.38 cm; 14.80 ± 2.56 kg), whereas girls performed better on the sit-and-reach (13.43 ± 4.96 cm) and completed slightly more rope-skipping repetitions (123.29 ± 30.76). For basic coordination capacities, girls exhibited superior static balance, with longer durations in the eyes-open and eyes-closed single-leg stance tests (56.91 ± 17.05 s; 16.61 ± 16.13 s). Building on the descriptive profiles, we prepared variables for modeling by computing pairwise Spearman correlations in the training set (full matrix in Table S2; key results in Fig. 3). The BRIEF indices BRI, MI, and GEC were highly correlated with the eight subscales (R > 0.85) and were therefore excluded. Similarly, body weight was removed due to its high correlation with BMI (R = 0.93). All remaining variables were retained for subsequent analyses.
Table 2: Descriptive statistics of included variables.
Figure 3: Spearman correlation (training set only).
Multiple linear regression results
The results of the multiple linear regression analysis are presented in Table 3. The model demonstrated a moderate level of fit (R^2^ = 0.558; adjusted R^2^ = 0.465), with an SEE of 11.299, an MAE of 4.782, and an RMSE of 6.128.
Table 3: Model fit of multiple linear regression predicting gross motor coordination from physical fitness, basic coordination, and executive function.
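The gap between R^2^ and adjusted R^2^ reported above reflects the standard penalty for the number of predictors. The sketch below shows that formula; the sample size, predictor count, and R^2^ in the example are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical values, for illustration only.
example = adjusted_r2(0.50, n_samples=100, n_predictors=10)
```

Because the penalty grows as the predictor count approaches the sample size, a model with many predictors fitted to a small sample can show a noticeably lower adjusted R^2^ than its raw R^2^.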
No multicollinearity issues were detected, as all variance inflation factor (VIF) values for the included predictors were below 5. Further examination of the regression coefficients (Table 4) revealed that sit-and-reach (β = 1.6697, p = 0.044), spatial orientation ability (β = –1.7148, p = 0.020), and eyes-closed static balance (β = 1.3938, p = 0.050) had statistically significant effects on gross motor coordination.
Table 4: Regression coefficients of multiple linear regression model predicting gross motor coordination from physical fitness, basic coordination, and executive function.
Machine learning analysis results
In this study, model performance was assessed by jointly considering higher R^2^ and lower RMSE and MAE, to ensure both predictive accuracy and reliability of feature interpretation. Among the four models tested, the RFR model achieved the best overall performance (R^2^ = 0.533; RMSE = 6.075; MAE = 4.850). Both XGBoost (R^2^ = 0.430; RMSE = 6.710; MAE = 5.233) and SVR (R^2^ = 0.489; RMSE = 6.360; MAE = 4.682) demonstrated comparable predictive ability and reasonable generalization capacity. In contrast, LGBM showed relatively weaker performance (R^2^ = 0.432; RMSE = 6.695; MAE = 5.525) compared to the other three models (Table 5). To further examine the stability of the RFR model, 5-fold cross-validation was performed, yielding results of R^2^ = 0.353 ± 0.155; RMSE = 7.148 ± 0.928; MAE = 5.586 ± 0.795. Although some fluctuation was observed, the model demonstrated generally good generalization. Given its superior predictive performance, the RFR model was selected for subsequent interpretation of key predictors.
Table 5: Model fit indices of four machine learning models.
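The "mean ± SD" form of the cross-validation results above can be illustrated with a small sketch of 5-fold cross-validation. Several elements are assumptions made for the illustration: a one-feature least-squares line stands in for the random forest, the data are synthetic, and the sample standard deviation is used because the text does not say which estimator was reported.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for k shuffled, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def fit_line(x, y):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def r2_score(y_true, y_pred):
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data, for illustration only.
x = [float(i) for i in range(20)]
y = [2.0 * xi + (1.0 if i % 2 == 0 else -1.0) for i, xi in enumerate(x)]

fold_scores = []
for train_idx, test_idx in k_fold_indices(len(x), k=5):
    slope, intercept = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
    predictions = [slope * x[i] + intercept for i in test_idx]
    fold_scores.append(r2_score([y[i] for i in test_idx], predictions))

summary = f"R2 = {mean(fold_scores):.3f} ± {stdev(fold_scores):.3f}"
```

With a small sample, each fold holds few observations, so fold-to-fold spread in R^2^ can be substantial; reporting the spread alongside the mean makes that instability visible.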
Feature importance and variable interpretation
According to the feature contributions derived from the RFR model (Fig. 4), spatial orientation, BMI, rhythm ability, and 50-m sprint performance made negative contributions to gross motor coordination. In contrast, dynamic balance, standing long jump, eyes-closed static balance, rope skipping, sit-and-reach, and eyes-open static balance made positive contributions. In the figure, negative SHAP values represent negative impacts, while positive SHAP values represent positive impacts on gross motor coordination.
Figure 4: Summary plot of the random forest model.
Figure 5 displays the ranked importance of all features. The top ten contributors in the RFR model were, in descending order: spatial orientation ability, BMI, dynamic balance, standing long jump, eyes-closed static balance, rope skipping, sit-and-reach, eyes-open static balance, rhythm ability, and 50-m sprint. Notably, spatial orientation ability and BMI had substantially higher importance scores compared to the remaining variables.
Figure 5: Feature importance plot of the random forest model.
Figure 6 presents SHAP dependence plots for the top five features—spatial orientation, BMI, dynamic balance, standing long jump, and eyes-closed static balance—to explore potential non-linear relationships with gross motor coordination. Overall, spatial orientation, eyes-closed static balance, and BMI exhibited clearly non-linear associations, whereas dynamic balance and standing long jump demonstrated approximately linear patterns.
Figure 6: SHAP dependence plots.
Discussion
This study compared four machine learning models—XGBoost, LGBM, RFR, and SVR—with a traditional multiple linear regression model to investigate the effects of physical fitness, basic coordination capacities, and executive function on GMC in children, aiming to construct an optimal predictive framework. Among all models, the RFR achieved the best predictive performance. We then applied SHAP to the best model for interpretability and identified the top ten contributors to GMC. The findings indicate that physical fitness and basic coordination capacities were the most influential predictors; notably, spatial orientation, eyes-closed static balance, and BMI showed nonlinear relationships with GMC, whereas executive function had a comparatively smaller impact.
The multiple linear regression model achieved an adjusted R^2^ of 0.465, with spatial orientation ability, sit-and-reach, and static balance (eyes closed) showing significant effects on GMC (p < 0.05), consistent with previous findings (De Chaves et al., 2016; Sui et al., 2024). However, compared to the machine learning models, the linear model exhibited relatively larger prediction errors (SEE = 11.299, RMSE = 6.128, MAE = 4.782), suggesting limited generalizability and stability. This may be attributed to constraints such as variable selection, sample characteristics, or the linear model’s inability to capture complex, nonlinear interactions. In contrast, the RFR model outperformed others, yielding a higher R^2^ (0.533) and lower prediction errors (RMSE = 6.075, MAE = 4.850), demonstrating superior predictive performance. These findings suggest that machine learning models—particularly RFR—are better suited to handling high-dimensional and nonlinear data in predicting children’s motor performance. Furthermore, the SHAP-based analysis strengthened the model’s interpretability by providing accurate and reliable evaluations of feature contributions and relative importance.
Compared to the linear regression model, this approach identified additional influential variables. Specifically, spatial orientation, BMI, dynamic balance, standing long jump, and eyes-closed static balance emerged as the top five predictors, highlighting the essential roles of spatial awareness, body composition, lower-limb strength, and both static and dynamic balance in GMC development. As an important indicator of body composition, BMI has been widely recognized as a significant factor affecting GMC and motor competence in children (D’Hondt et al., 2014; Fernandes et al., 2022; Hudson, Ballou & Willoughby, 2021). Excess adiposity increases mechanical load and reduces movement economy, which can constrain coordination; conversely, a healthy weight supports neuromuscular efficiency and movement control, facilitating better coordination (Biino et al., 2023; Fernandes et al., 2022).
Among all predictors, spatial orientation ability showed the strongest impact on GMC and demonstrated a nonlinear association. Previous studies have emphasized the close connection between spatial orientation and rhythm ability, noting their mutually reinforcing roles in sustained physical activity (Sui et al., 2024; Qu, 2020). Rhythm training not only supports spatial perception, postural control, and dynamic balance but also improves multisensory integration, thereby enhancing motor coordination and execution efficiency (Jerzy et al., 2015; Peker & Vural, 2018). Notably, rhythm ability also ranked among the top ten contributing factors to gross motor coordination in this study. Although empirical research specifically addressing the impact of spatial orientation on GMC remains limited, prior evidence suggests significant correlations between spatial orientation, rhythm ability, and motor coordination (Hands, 2008; Jerzy et al., 2015). Therefore, it can be inferred that spatial orientation may be one of the key factors influencing the development of GMC. Systematic training targeting spatial orientation may help enhance children’s body-space awareness, information processing, and performance in complex motor tasks (Boccia et al., 2017). In addition, dynamic balance, eyes-closed static balance, and standing long jump also demonstrated strong predictive power in the model, reinforcing their importance in maintaining posture and executing coordinated movements. The observed higher predictive value of eyes-closed static balance compared to eyes-open balance may stem from the increased reliance on proprioceptive and vestibular input in the absence of visual cues. These sensory systems are essential for maintaining balance during complex multi-joint and multi-degree-of-freedom movements, which often require precise sensory feedback and tightly coordinated isometric, concentric, and eccentric muscle actions (Dos Santos et al., 2018; Jain et al., 2022).
Notably, these variables were not prominent in the linear regression model but showed strong explanatory power in the RFR model, reinforcing the advantage of ML in detecting nonlinear interactions and latent influential features.
Notably, executive function did not emerge as a significant predictor of GMC in this study. Effective engagement of executive function typically requires novel, cognitively demanding tasks that activate the cerebellar–prefrontal circuitry (Ren et al., 2025). In this study, executive function was assessed using the parent-reported BRIEF questionnaire, which, while practical and easy to administer, may not fully capture children’s cognitive regulation during actual movement or task-based scenarios. Subjective assessments can be influenced by parental interpretation bias and fragmented observation, potentially underestimating the true role of executive function. Future studies are encouraged to employ more objective assessments—such as behavioral tasks combined with neuroimaging techniques (e.g., fNIRS or EEG)—to gain deeper insight into the contribution of executive function to GMC in children.
In summary, the findings suggest that physical fitness and basic coordination capacities exert a stronger influence on GMC than executive function, and demonstrate greater predictive value. Therefore, early intervention strategies should focus on enhancing spatial orientation, increasing physical activity to reduce BMI, and improving lower-limb strength as well as both static and dynamic balance. Specific recommendations include using maze navigation and orienteering games to develop spatial awareness and directional judgment; using rhythmic activities such as rope skipping, rhythmic gymnastics, and patterned running to develop children’s sensitivity to movement rhythm and improve their motor coordination and fluidity; and combining dynamic/static stretching and plyometric training to improve muscle strength, flexibility, and postural stability (Ayotte & Corcoran, 2018; Hands, 2008), thereby facilitating overall GMC development.
While this study presents a relatively comprehensive model, several limitations should be noted. The cross-sectional design and omission of contextual/psychosocial variables limit causal inference and interpretability. A modest sample from three Shanghai public schools without school-/sex-stratified analyses may reduce generalizability. EF was measured only by parent-report (BRIEF), which could underestimate EF–GMC associations. Despite these limitations, this study was among the first to integrate physical fitness, basic coordination, and executive function using machine learning techniques to examine the underlying factors contributing to gross motor coordination in children. Future research should consider longitudinal or intervention-based designs to establish causal pathways, expand the variable framework to include ecological and psychological dimensions, and adopt multimodal assessment approaches to improve the validity and generalizability of findings.
Conclusions
This study shows that the RFR model outperformed traditional linear regression in predicting children’s GMC, suggesting the presence of complex nonlinear relationships that ML methods capture more effectively. SHAP analysis identified spatial orientation, BMI, dynamic balance, standing long jump, and eyes-closed static balance as the most influential predictors, with spatial orientation, eyes-closed static balance, and BMI showing clear nonlinear effects, whereas EF contributed little. Overall, RFR both improves prediction and identifies actionable targets, namely spatial awareness, balance, lower-limb strength, and healthy body composition.
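For readers unfamiliar with how SHAP assigns importance, the underlying Shapley value can be sketched by brute force for a very small model. Everything below is an illustrative assumption: `toy_model`, its coefficients, the feature names, and the single `baseline` point are hypothetical and are not the fitted RFR model or the study data. Practical SHAP implementations for tree ensembles use faster algorithms and average over a background dataset rather than a single baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical nonlinear score: a linear term, a quadratic penalty,
    # and an interaction. The coefficients are invented for illustration.
    spatial, bmi, balance = z
    return 2.0 * spatial - (bmi - 1.0) ** 2 + spatial * balance


x = [1.0, 2.0, 2.0]          # hypothetical child (standardized units)
baseline = [0.0, 1.0, 0.0]   # hypothetical reference point

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["spatial", "bmi", "balance"], phi):
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the prediction at `x` and the prediction at `baseline`, which is the additivity property that makes SHAP values readable as per-feature shares of a single prediction. The interaction term is split equally between the two features involved.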
Supplemental Information
Supplemental Information 1. Dataset. DOI: 10.7717/peerj.20827/supp-1
Supplemental Information 2. Full matrix. DOI: 10.7717/peerj.20827/supp-2
Supplemental Information 3. Light Gradient Boosting Machine Regression. DOI: 10.7717/peerj.20827/supp-3
Supplemental Information 4. Extreme Gradient Boosting. DOI: 10.7717/peerj.20827/supp-4
Supplemental Information 5. Support Vector Regression. DOI: 10.7717/peerj.20827/supp-5
Supplemental Information 6. Random Forest Regression. DOI: 10.7717/peerj.20827/supp-6
References
- 1. Ali MM, Paul BK, Ahmed K, Bui FM, Quinn JMW, Moni MA. Heart disease prediction using supervised machine learning algorithms: performance analysis and comparison. Computers in Biology and Medicine. 2021;136(4):104672. doi:10.1016/j.compbiomed.2021.104672. PMID: 34315030.
- 2. Ayotte D Jr., Corcoran MP. Individualized hydration plans improve performance outcomes for collegiate athletes engaging in in-season training. Journal of the International Society of Sports Nutrition. 2018;15(1):27. doi:10.1186/s12970-018-0230-2. PMID: 29866199. PMCID: PMC5987390.
- 3. Bańkosz Z, Nawara H, Ociepa M. Assessment of simple reaction time in badminton players. Trends in Sport Sciences. 2013;1(20):54–61.
- 4. Biino V, Pellegrini B, Zoppirolli C, Lanza M, Gilli F, Giuriato M, Schena F. Gross motor coordination in relation to weight status: a longitudinal study in children and pre-adolescents. Frontiers in Public Health. 2023;11:1242712. doi:10.3389/fpubh.2023.1242712. PMID: 38235161. PMCID: PMC10792555.
- 5. Boccia M, Rosella M, Vecchione F, Tanzilli A, Palermo L, D’Amico S, Guariglia C, Piccardi L. Enhancing allocentric spatial recall in pre-schoolers through navigational training programme. Frontiers in Neuroscience. 2017;11:574. doi:10.3389/fnins.2017.00574. PMID: 29085278. PMCID: PMC5650605.
- 6. Cattuzzo MT, Dos Santos Henrique R, Ré AH, de Oliveira IS, Melo BM, de Sousa Moura M, de Araújo RC, Stodden D. Motor competence and health related physical fitness in youth: a systematic review. Journal of Science and Medicine in Sport. 2016;19(2):123–129. doi:10.1016/j.jsams.2014.12.004. PMID: 25554655.
- 7. Cummins A, Piek JP, Dyck MJ. Motor coordination, empathy, and social behaviour in school-aged children. Developmental Medicine & Child Neurology. 2005;47(7):437–442. doi:10.1017/s001216220500085x. PMID: 15991862.
- 8. De Chaves RN, Bustamante Valdívia A, Nevill A, Freitas D, Tani G, Katzmarzyk PT, Maia JA. Developmental and physical-fitness associations with gross motor coordination problems in Peruvian children. Research in Developmental Disabilities. 2016;53–54:107–114. doi:10.1016/j.ridd.2016.01.003. PMID: 26871464.
