Uncovering Nonlinear Predictors of Serum Biomarker Uric Acid Using Interpretable Machine Learning in Healthy Men
Chung-Chi Yang, Min-Chung Shen, Zih-Yin Lai, Jyun-Cheng Ke, Ta-Wei Chu, Yung-Jen Chuang

TL;DR
This study uses interpretable machine learning to uncover nonlinear relationships between uric acid and health factors in healthy men, revealing new insights into metabolic thresholds.
Contribution
The study introduces interpretable machine learning to identify nonlinear and threshold-based predictors of uric acid levels that traditional methods miss.
Findings
Waist-to-hip ratio influences uric acid only below a threshold of 0.969.
Creatinine's effect on uric acid becomes significant above 0.97 mg/dL, indicating a renal threshold.
Betel nut exposure shows a complex, non-binary association with uric acid metabolism.
Abstract
Background: Uric acid (UA) is linked to gout, renal dysfunction, and cardiovascular disease. Prior studies often assume linear relationships, potentially oversimplifying physiological complexity. Methods: We analyzed data from 5200 healthy Taiwanese men. Demographic, biochemical, lifestyle, and inflammatory variables were assessed using Pearson correlation, multiple linear regression (MLR), and multivariate adaptive regression splines (MARS), an interpretable machine learning method for detecting nonlinear, threshold-based effects. Results: Pearson correlation showed broad linear associations, whereas MARS identified fewer but more physiologically meaningful predictors. Waist-to-hip ratio (WHR) had a strong threshold effect, influencing UA only below 0.969. Creatinine showed a nonlinear impact, becoming substantial above 0.97 mg/dL, suggesting a renal threshold within the “normal”…
Taoyuan Armed Forces General Hospital
1. Introduction
Uric acid (UA) is the end product of purine metabolism. About 80% of UA is derived from endogenous metabolism of amino acids and nucleic acids, while the remaining 20% originates from dietary sources rich in purine or nucleic acid proteins [1]. UA is primarily excreted through the kidneys into urine, and excessive intake of purine-rich or nucleic acid-rich foods may elevate serum UA levels and place an additional burden on renal function [1]. According to Taiwan’s Health Promotion Administration, Ministry of Health and Welfare, an average of 750 mg of UA is produced daily, with approximately 500 mg excreted by the kidneys [2]. The remainder is eliminated via bile secretion into the colon and ultimately expelled in feces. Hyperuricemia is defined as serum UA ≥7 mg/dL in men and ≥6 mg/dL in women, and is associated with increased risks of gout, renal stones, and arthritis [2]. In recent years, UA has attracted increasing clinical attention because of its strong association not only with classical outcomes such as gout and nephrolithiasis, but also with metabolic syndrome and cardiovascular diseases [3,4,5]. Elevated UA levels have been linked to obesity, insulin resistance, hypertension, and dyslipidemia, key components of metabolic syndrome, suggesting that UA plays an integral role in metabolic homeostasis and systemic inflammation [3,4]. Furthermore, accumulating evidence indicates that UA contributes to endothelial dysfunction, oxidative stress, and systemic inflammation, all of which are key mechanisms in the development of cardiovascular disease [4,5]. One comprehensive review highlighted that elevated UA may affect the activity of enzymes such as nitric oxide synthase, adenosine monophosphate kinase, adenosine monophosphate dehydrogenase, and nicotinamide adenine dinucleotide phosphate, contributing to pathological processes involved in cardiovascular diseases [5].
Therefore, UA is increasingly recognized not only as a biomarker of purine metabolism, but also as an active participant in cardiometabolic disease pathogenesis, reinforcing its clinical significance.
An important and unresolved question is which risk factors interact with UA, and how these relationships may vary across different physiological and biochemical states [6,7]. In particular, understanding whether these associations follow linear or nonlinear patterns can offer critical insights into the underlying mechanisms regulating UA metabolism. Such knowledge can deepen our understanding of physiological phenomena and their implications for disease risk and progression. However, most prior studies have relied on traditional statistical methods, which may fail to capture complex and nonlinear interactions among variables [6,7]. In recent years, the rise of artificial intelligence has brought new analytical tools to biomedical research. Machine learning, in particular, is well suited to modeling the complexity and nonlinearity of large datasets and can outperform traditional statistical approaches such as multiple linear regression (MLR) [8]. To our knowledge, few studies have employed machine learning to explore hyperuricemia risk factors, and those that did treated UA as a binary variable (presence or absence of hyperuricemia), which limits their clinical interpretability and fails to capture the full spectrum of UA variation [8,9]. Among the various machine learning approaches, multivariate adaptive regression splines (MARS) offers some unique advantages. MARS not only accommodates nonlinear associations but also generates interpretable equations, bridging the gap between black-box models and conventional regression techniques. This makes MARS particularly suitable for clinical applications where transparency and interpretability are essential.
In the present study, we employed MARS to analyze data from healthy Taiwanese men, incorporating demographic, biochemical, lifestyle, and inflammatory markers. Our aim was not simply to build a predictive model, but to leverage an interpretable mathematical formula to understand the underlying relationships between UA and its associated factors, especially their nonlinear patterns and biological implications.
2. Materials and Methods
2.1. Participant and Study Design
Participant data collection has been previously reported by our group [10]. The present analysis utilized data from the Taiwan MJ Cohort, an ongoing prospective health screening program managed by the MJ Health Screening Centers in Taiwan [11]. The health examinations capture more than 100 biological indicators, including anthropometric parameters, blood biomarkers, and imaging assessments. Participants also complete a self-administered questionnaire covering personal and family medical history, current health conditions, lifestyle habits, physical activity, sleep, and dietary patterns [12].
This study represents a secondary analysis of de-identified data obtained from the MJ Health Clinics. At the time of their health evaluations, all participants provided broad informed consent permitting the use of anonymized data for future research. The dataset is curated and maintained by the MJ Health Research Foundation, and the analyses were conducted under authorization (Authorization Code: MJHRF2023015A). The interpretations and conclusions of this work are solely those of the authors and do not necessarily reflect the views of the Foundation. Additional methodological details are available in the Foundation’s annual technical report [12].
The study protocol was approved by the Institutional Review Board of Tri-Service General Hospital (IRB No. C202305049). Since no new biological specimens were collected, the study qualified for expedited review and did not require additional informed consent. The study population comprised men aged 20 to 80 years. Participants with a history of cancer or those taking medications for hyperglycemia, hypertension, hyperlipidemia, hyperuricemia, or corticosteroids were excluded. The participant selection process is illustrated in Figure 1.
2.2. Laboratory Tests
On the day of the health examination, experienced nursing staff documented each participant’s medical history, including current medication use, and performed a standardized physical examination. Waist circumference was measured horizontally at the natural waist, and body mass index (BMI) was calculated as weight in kilograms divided by height in meters squared. Blood pressure was assessed on the right arm in a seated position using a standard mercury sphygmomanometer to record both systolic and diastolic values. Physical examinations and blood pressure measurements were conducted in accordance with ISO 9001 standards [13] and CAP-accredited laboratory procedures. Abnormal values were re-assessed, and all instruments were regularly calibrated.
Following a 10-h overnight fast, venous blood samples were obtained for biochemical testing. Plasma was separated within one hour of collection and stored at −30 °C until analysis. Fasting plasma glucose (FPG) was measured with the glucose oxidase method (YSI 203 glucose analyzer, Yellow Springs Instruments, Yellow Springs, OH, USA). Total cholesterol and triglycerides were determined using the dry multilayer analytical slide method on the Fuji Dri-Chem 3000 analyzer (Fuji Photo Film, Tokyo, Japan). Serum high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) were quantified by enzymatic assay following dextran sulfate precipitation. Urinary microalbumin concentrations were measured using turbidimetry on a Beckman Coulter AU5800 biochemical analyzer (Beckman Coulter Inc., Brea, CA, USA).
Demographic data included information on marital status and whether participants had a spouse. Drinking was calculated as the product of total duration of alcohol consumption, drinking frequency, and alcohol concentration. Similarly, smoking quantity and betel nut exposure were calculated by multiplying the duration, frequency, and quantity (number of cigarettes or betel nuts consumed). Sports was derived from the product of the duration, frequency, and intensity (by type of physical activity). All these parameters were treated as independent variables in the analysis, with serum UA serving as the dependent variable.
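The lifestyle indices described above can be sketched in Python as a simple product of three components. This is an illustration only: the function name `exposure_index`, the units, and the example participant are hypothetical, since the text does not state how each component was coded.

```python
def exposure_index(duration, frequency, quantity):
    """Cumulative exposure as the product of duration, frequency, and quantity.

    Units are an assumption (e.g. years x days per week x items per day);
    the source does not specify the coding of each component.
    """
    for value in (duration, frequency, quantity):
        if value < 0:
            raise ValueError("exposure components must be non-negative")
    return duration * frequency * quantity


# Hypothetical participant: 10 years of use, 3 days per week, 5 betel nuts per day.
betel_nut_index = exposure_index(duration=10, frequency=3, quantity=5)

# A never-user has every component equal to zero, so the index is zero.
never_user_index = exposure_index(duration=0, frequency=0, quantity=0)
```

Because the index is a product, a zero in any component yields zero exposure, and the resulting distribution is typically right-skewed with many zeros.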
2.3. Traditional Statistics
An independent t-test was used to compare UA levels between different marital status groups. Since sleep duration and education level are ordinal variables, analysis of variance (ANOVA) was employed to compare UA across their respective categories. Simple correlation was conducted to evaluate the relationships between UA and other continuous variables. All statistical analyses were performed using SPSS software version 19.0 (IBM Inc., Armonk, NY, USA).
2.4. Machine Learning Method
In the present study, the MARS technique was employed to analyze the dataset. MARS is a flexible and powerful method for modeling high-dimensional data, utilizing an expansion framework based on product spline basis functions. Importantly, the number of basis functions and their associated characteristics are automatically determined through data-driven processes [14]. Conceptually, MARS is aligned with recursive partitioning methods and is similarly capable of capturing complex, higher-order interactions.
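To make the idea of product spline basis functions concrete, the following Python sketch evaluates a MARS-style model built from hinge functions. It is not the fitted model from this study: the function names, variable names, knots, and coefficients are all hypothetical, and the forward and backward selection steps that MARS uses to choose terms are omitted.

```python
def hinge_pos(x, knot):
    """max(0, x - knot): non-zero only above the knot."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """max(0, knot - x): non-zero only below the knot."""
    return max(0.0, knot - x)


def mars_predict(row, intercept, terms):
    """Evaluate an intercept plus a weighted sum of basis functions.

    terms is a list of (coefficient, factors); each factor is a
    (variable, knot, direction) triple, and the factors within one
    term are multiplied, which is how interactions are represented.
    """
    total = intercept
    for coefficient, factors in terms:
        basis = 1.0
        for variable, knot, direction in factors:
            hinge = hinge_pos if direction == "+" else hinge_neg
            basis *= hinge(row[variable], knot)
        total += coefficient * basis
    return total


# Hypothetical two-term additive model (degree 1, no interactions).
example_terms = [
    (2.0, [("x1", 1.0, "+")]),   # contributes only when x1 > 1.0
    (-3.0, [("x2", 0.5, "-")]),  # contributes only when x2 < 0.5
]
prediction = mars_predict({"x1": 1.5, "x2": 0.3}, intercept=6.0, terms=example_terms)
```

Each term is zero on one side of its knot, which is why a predictor can matter only within a specific value range.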
For model construction, the dataset was randomly partitioned into training (80%) and testing (20%) subsets. To optimize the MARS model, hyperparameter tuning was performed within the training subset. Specifically, the training data were further stratified into an internal training set and a validation set. A grid search procedure was implemented across predefined ranges of key hyperparameters, including the maximum number of basis functions and the degree of allowed interactions. Model performance was evaluated using the root mean square error (RMSE) on the validation set, and the configuration yielding the lowest RMSE was retained as the optimal MARS specification. This optimized model was subsequently benchmarked against a conventional MLR model for comparative performance assessment.
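The tuning procedure above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' R workflow: a single-hinge least-squares fit (`fit_hinge`) stands in for MARS, the tuned parameter is a knot rather than the number of basis functions or interaction degree, the data are synthetic, the inner split ratio is an assumption, and simple random splitting replaces the stratification mentioned in the text.

```python
import itertools
import math
import random


def split(rows, holdout_fraction, seed):
    """Shuffle a copy of rows and return (kept, held_out)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def fit_hinge(train, knot):
    """Toy stand-in for a MARS fit: least squares on max(0, x - knot)."""
    h = [max(0.0, x - knot) for x, _ in train]
    y = [target for _, target in train]
    h_mean, y_mean = sum(h) / len(h), sum(y) / len(y)
    var_h = sum((v - h_mean) ** 2 for v in h)
    cov = sum((v - h_mean) * (t - y_mean) for v, t in zip(h, y))
    slope = 0.0 if var_h == 0 else cov / var_h
    intercept = y_mean - slope * h_mean
    return lambda x: intercept + slope * max(0.0, x - knot)


def grid_search(fit, train, valid, grid):
    """Return (best_rmse, best_params) over all combinations in grid."""
    best = None
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid, values))
        predict = fit(train, **params)
        score = rmse([t for _, t in valid], [predict(x) for x, _ in valid])
        if best is None or score < best[0]:
            best = (score, params)
    return best


# Synthetic, noise-free data with a true knot at 1.0 (illustrative only).
data = [(i / 50, 2.0 + 3.0 * max(0.0, i / 50 - 1.0)) for i in range(101)]
train_all, test = split(data, 0.20, seed=1)          # 80/20 split, as in the text
inner_train, valid = split(train_all, 0.20, seed=2)  # inner ratio is an assumption
best_rmse, best_params = grid_search(
    fit_hinge, inner_train, valid, {"knot": [0.5, 1.0, 1.5]}
)
```

The held-out `test` rows are never touched during tuning, so they remain available for an unbiased final comparison between models.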
Prior to performing the machine learning analysis, all data preprocessing and quality checks were completed. In this study, continuous variables were normalized using Z-score standardization, while skewed biochemical parameters (e.g., triglycerides, uric acid) were log-transformed. Robust scaling was considered for variables with extreme values. Given the very low proportion of missing data, cases with missing values were excluded from the analysis.
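A minimal Python sketch of these preprocessing steps is given below. The order of operations (listwise deletion, then log transform, then Z-score), the use of the natural logarithm, and the sample standard deviation are assumptions, as the text does not specify them; the rows are made-up values.

```python
import math
import statistics


def drop_incomplete(rows):
    """Listwise deletion: keep only rows with no missing (None) values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def log_transform(values):
    """Natural log for right-skewed, strictly positive measurements."""
    return [math.log(v) for v in values]


def z_score(values):
    """Centre to mean 0 and scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical rows; "tg" stands for triglycerides in mg/dL.
rows = [
    {"age": 30, "tg": 80.0},
    {"age": 45, "tg": 150.0},
    {"age": 52, "tg": None},  # missing value, so this row is excluded
    {"age": 60, "tg": 400.0},
]
complete = drop_incomplete(rows)
age_z = z_score([r["age"] for r in complete])
tg_z = z_score(log_transform([r["tg"] for r in complete]))
```

In a train/test workflow, the mean and standard deviation would normally be estimated on the training subset only and then applied to the test subset, to avoid leaking information.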
During model evaluation, predictive performance was quantified using the independent testing subset that had been withheld from the training process. Given that serum UA was modeled as a continuous outcome, multiple complementary error metrics were computed to provide a robust assessment of model accuracy. Specifically, symmetric mean absolute percentage error (SMAPE), relative absolute error (RAE), root relative squared error (RRSE), and root mean square error (RMSE) were calculated. These metrics collectively capture both absolute and relative deviations between observed and predicted values, as well as sensitivity to large errors. A detailed summary of the evaluation results is presented in Table 1.
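The four error metrics can be computed as in the sketch below. The formulas are the commonly used definitions and are an assumption here, since the text does not give them explicitly; the observed and predicted values are hypothetical.

```python
import math


def regression_errors(observed, predicted):
    """Return SMAPE, RAE, RRSE, and RMSE for paired observations.

    SMAPE: mean of |o - p| / ((|o| + |p|) / 2), as a fraction.
    RAE:   total absolute error relative to a mean-only predictor.
    RRSE:  root of total squared error relative to a mean-only predictor.
    RMSE:  root of the mean squared error.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    sq_err = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    smape = sum(
        2 * e / (abs(o) + abs(p)) for e, o, p in zip(abs_err, observed, predicted)
    ) / n
    rae = sum(abs_err) / sum(abs(o - mean_obs) for o in observed)
    rrse = math.sqrt(sum(sq_err) / sum((o - mean_obs) ** 2 for o in observed))
    rmse = math.sqrt(sum(sq_err) / n)
    return {"SMAPE": smape, "RAE": rae, "RRSE": rrse, "RMSE": rmse}


observed = [5.0, 6.0, 7.0, 8.0]   # hypothetical UA values (mg/dL)
predicted = [5.5, 6.0, 6.5, 8.0]  # hypothetical model output
errors = regression_errors(observed, predicted)
```

Under these definitions, RAE and RRSE values close to 1 mean the model does little better than predicting the mean for everyone, and 1 − RRSE² equals the coefficient of determination computed on the same data.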
To provide a comparative context, the averaged performance metrics of the MARS model were used to benchmark its performance against the MLR model. It is noteworthy that both models, MARS and MLR, were trained and tested on the same dataset, ensuring consistency in evaluation.
To obtain 95% confidence intervals, we quantified uncertainty in threshold locations by resampling. When MARS produced hinge terms, thresholds were estimated by bootstrapping the MARS model (B = N; median and 2.5th–97.5th percentiles). When MARS did not yield hinges, we estimated per-variable breakpoints using univariate segmented regression with multiple starting values; if that failed, we applied a two-piece linear (hinge) grid search with bootstrap. The method used for each variable is indicated in the table.
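The fallback procedure, a hinge grid search combined with a bootstrap, can be sketched in Python as follows. This is an illustration on synthetic data rather than the authors' implementation: the candidate grid, the number of resamples `n_boot`, the noise level, and the simulated knot at 0.97 are all assumptions chosen for the example.

```python
import random


def hinge_sse(xs, ys, knot):
    """Sum of squared errors of the least-squares fit y = a + b * max(0, x - knot)."""
    h = [max(0.0, x - knot) for x in xs]
    n = len(xs)
    h_mean, y_mean = sum(h) / n, sum(ys) / n
    var_h = sum((v - h_mean) ** 2 for v in h)
    if var_h == 0:  # no point lies above the knot: intercept-only fit
        return sum((y - y_mean) ** 2 for y in ys)
    slope = sum((v - h_mean) * (y - y_mean) for v, y in zip(h, ys)) / var_h
    intercept = y_mean - slope * h_mean
    return sum((y - intercept - slope * v) ** 2 for v, y in zip(h, ys))


def best_knot(xs, ys, candidates):
    """Grid search: the candidate knot with the smallest SSE."""
    return min(candidates, key=lambda k: hinge_sse(xs, ys, k))


def percentile(sorted_values, q):
    """Linear-interpolation percentile of a sorted list, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def bootstrap_threshold(xs, ys, candidates, n_boot, seed):
    """Return (median, 2.5th percentile, 97.5th percentile) of bootstrapped knots."""
    rng = random.Random(seed)
    n = len(xs)
    knots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        knots.append(best_knot([xs[i] for i in idx], [ys[i] for i in idx], candidates))
    knots.sort()
    return percentile(knots, 50), percentile(knots, 2.5), percentile(knots, 97.5)


# Synthetic predictor with a simulated slope change at 0.97 plus Gaussian noise.
noise = random.Random(0)
xs = [0.5 + i / 199 for i in range(200)]
ys = [5.0 + 4.0 * max(0.0, x - 0.97) + noise.gauss(0, 0.1) for x in xs]
candidates = [0.80 + 0.01 * i for i in range(41)]
median, lower, upper = bootstrap_threshold(xs, ys, candidates, n_boot=200, seed=42)
```

A wide interval, or one that touches the edge of the candidate grid, signals weak evidence for a sharp change in slope at that location.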
All statistical analyses and modeling procedures were conducted using R software version 4.0.5 and RStudio version 1.1.453, with all necessary packages installed. The MARS models were implemented using the “earth” package (version 5.3.3) [15], and hyperparameter tuning was conducted via the “caret” package (version 6.0-94) [16]. The MLR models were developed using the base “stats” package in R (version 4.0.5) with default settings.
3. Results
A total of 5200 healthy male participants were included in the final analysis. Their demographic and baseline characteristics are described in detail (Table 2). To explore the relationship between serum UA levels and various demographic, biochemical, and lifestyle variables, Pearson correlation analysis was first conducted. Most variables demonstrated statistically significant associations with UA levels, with the direction of correlation varying across parameters. Notably, LDL-C, plasma phosphorus concentration, alkaline phosphatase, alpha-fetoprotein, carcinoembryonic antigen, homocysteine, fibrinogen, smoking, betel nut exposure, and sports were non-significantly correlated with UA (Table 3). This wide range of significant associations underscores the complexity of the physiological interactions contributing to UA regulation.
To further investigate how social and lifestyle factors influence UA levels, we performed a series of group comparisons. However, only education level showed a significant difference in UA levels, while marital status and sleep duration did not, as determined by t-tests and ANOVA (Table 4). These findings suggest that while metabolic and biochemical markers have measurable correlations with UA, certain social determinants may have limited impact in this healthy male population.
We then compared the performance of two modeling approaches, MLR and MARS, for predicting UA levels. Both models performed comparably, with MARS showing a marginally higher RMSE (1.6694 vs. 1.6666) and negligible differences across other metrics (Table 5). Predictive gains were therefore minimal, and the principal value of MARS lies in the physiological interpretability it offers by revealing localized, nonlinear effects. Given the standard deviation of UA (1.32), the implied R² was approximately 0.044 for MLR and 0.042 for MARS, indicating that both models explained only about 4% of the variance. Accordingly, we emphasize pattern discovery and physiological interpretation over prediction, and we avoid overstating clinical utility. The MLR equation is expressed as below.
In-depth analysis of the final MARS model revealed that only a limited subset of variables contributed substantially to UA prediction. These included waist-to-hip ratio (WHR), creatinine, plasma calcium concentration, high-sensitivity C-reactive protein (Hs-CRP), betel nut exposure (BN), age, γ-glutamyl transferase (γ-GT), FPG, lactate dehydrogenase (LDH), and triglycerides (Table 6). Based on the basis functions in Table 6, the MARS-generated equation for estimating UA is as follows:
A screenshot is provided in the Supplementary Materials Table S1. By copying and pasting the content of the Word file into Excel and typing the related factors into the corresponding Excel cells, the result of the equation will appear in cell A11.
To enhance clinical interpretability, we compared the MARS-derived thresholds against established clinical cut-offs for metabolic syndrome, diabetes, obesity, and kidney dysfunction (Table 7). These thresholds represent inflection points in the UA–predictor relationship within a healthy cohort, indicating changes in slope rather than disease states. They are not diagnostic cut-offs; their role is hypothesis-generating and requires validation in case–control or longitudinal cohorts, since clinical cut-offs are typically defined by comparing healthy and diseased populations.
For example, the creatinine breakpoint at 0.97 mg/dL reflects the point where UA excretion begins to rise disproportionately, despite lying below the conventional abnormal range for renal impairment. Similarly, the WHR threshold (0.969) highlights an inflection point that may mark a subclinical physiological transition rather than a disease state. However, the 95% confidence interval for this WHR threshold was wide (e.g., 0.92–1.01), overlapping with conventional clinical cutoffs, and the apparent “protective” association below this value should not be interpreted as definitive. This pattern may reflect unmeasured confounding—for example, individuals with lower central adiposity may have different dietary patterns (e.g., lower rice or seafood intake) that influence UA exposure. Moreover, the threshold is data-driven and specific to our cohort; it should not be generalized without external validation. Rather than indicating a clinical intervention point, this finding primarily highlights a potential nonlinearity in the relationship between adiposity and UA metabolism that merits further investigation.
Likewise, the fasting glucose threshold (115 mg/dL) suggests a sensitivity zone within a continuous relationship. These values should therefore be interpreted as physiological markers of sensitivity zones within continuous relationships, not as substitutes for established clinical definitions. Nonetheless, their alignment or divergence from guideline cut-offs suggests that such exploratory thresholds may provide mechanistic insights and inform hypotheses for future longitudinal and case–control investigations.
Notably, the influence of these variables was not uniformly linear; rather, each variable demonstrated impact on UA only within specific value ranges, as visualized in Figure 2. For example, the effect of WHR on UA was more pronounced below a certain threshold, while creatinine had a sharply positive effect above a specific cutoff. These localized effects highlight the strength of MARS in identifying biologically meaningful, nonlinear associations that would be overlooked by linear models or traditional correlation analysis.
This discrepancy between the broad statistical significance observed in Pearson correlation and the focused, range-specific associations revealed by MARS emphasizes a key methodological insight. While Pearson correlation treats each variable’s effect as constant across its entire range, MARS accommodates complexity and offers a clearer understanding of which variables truly drive UA variability and under what conditions. A schematic overview summarizing the design, analytical workflow, and key findings of the study is provided in Figure 3.
Figure S1 visualizes the estimated turning points in the association between each predictor and UA. Points denote the median threshold, and horizontal bars (and violins when bootstrapped) show the 95% CI for the threshold location; the estimation method used for each variable is listed in Table S2.
Clear, well-localized thresholds (narrow CIs) were observed for γ-GT, TG, LDH, FPG, and age, indicating distinct inflection points in their relationships with UA. In contrast, variables such as Cr, WHR, CRP, calcium, and betel-nut exposure exhibited broader CIs and/or thresholds close to the boundary of their observed ranges, suggesting weaker evidence for a sharp change in slope. Exact point estimates and 95% CIs for all variables are provided in Supplementary Table S2. Notably, the prevalence of individuals beyond each MARS-identified threshold varied widely, offering important context for interpreting their physiological relevance. For instance, creatinine > 0.97 mg/dL affected only 9.8% of participants, yet this small subgroup exhibited a sharp rise in uric acid, highlighting the sensitivity of MARS to nonlinear effects even within the conventional “normal” laboratory range. Similarly, fasting plasma glucose > 115 mg/dL, which lies between the thresholds for prediabetes (≥100 mg/dL) and diabetes (≥126 mg/dL), was observed in just 9.2% of the cohort, suggesting that metabolic dysregulation may influence uric acid metabolism earlier than current clinical definitions imply. In contrast, hs-CRP > 3.38 mg/L was present in 36.0% of participants, closely aligning with the established cardiovascular risk cutoff of 3.0 mg/L and reinforcing systemic inflammation as a key driver of uric acid elevation. Calcium < 9.5 mg/dL was remarkably common, affecting 89.4% of the cohort, indicating that even low–normal calcium levels (still within the standard reference range of 8.5–10.5 mg/dL) are physiologically relevant to uric acid regulation. Finally, betel nut exposure > 5 units was rare (4.1%), yet it emerged as a significant nonlinear predictor, demonstrating the ability of MARS to uncover complex, non-binary associations even in sparse subgroups.
Together, these proportions underscore that MARS identifies both common and infrequent—but biologically meaningful—inflection points that linear models overlook.
4. Discussion
This study applied a MARS approach to identify and characterize nonlinear associations between serum UA levels and a comprehensive set of demographic, biochemical, lifestyle, and inflammatory factors in a large cohort of healthy Taiwanese men. It should be noted that both MLR and MARS achieved low explanatory power (R² < 0.05), consistent with the weak bivariate correlations. Therefore, while the models provide mechanistic and physiological insight, their utility for individual-level prediction remains limited. While traditional Pearson correlation revealed numerous statistically significant relationships, the MARS model uncovered a more refined and physiologically meaningful set of predictors, including WHR, FPG, creatinine, calcium, Hs-CRP, and betel nut exposure, many of which exhibited threshold-dependent effects not captured by conventional linear models. Our findings highlight the importance of using advanced, interpretable machine learning models to reveal complex, range-specific interactions that may underlie metabolic regulation. In particular, the identification of nonlinear breakpoints in variables such as WHR and creatinine underscores the need for precision thresholds in both clinical screening and public health strategies. Moreover, the novel associations found for LDH and betel nut exposure provide new directions for future investigation into metabolic and lifestyle determinants of UA regulation.
It is important to note that both MLR and MARS achieved low coefficients of determination (R² = 0.044 and 0.042, respectively), indicating that the included predictors explain only a small proportion of the total variance in UA concentrations. While MARS did not substantially improve predictive accuracy over MLR (Table 5), its principal contribution lies in uncovering biologically plausible, threshold-dependent relationships that linear models inherently cannot detect. This underscores that the goal of this analysis was explanatory insight, not purely predictive performance.
Although statistically significant associations were observed for several predictors, the overall explanatory power of both models was low (R² < 0.05), consistent with the weak bivariate correlations reported in Table 3. This suggests that the majority of variability in UA levels in this cohort is driven by factors not captured in our dataset, such as unmeasured dietary exposures (e.g., seafood, rice), genetic differences in UA metabolism, or temporal variation in exposure. Consequently, the clinical or public health utility of these models for individual-level prediction is limited, and interpretations should focus on population-level associations rather than predictive accuracy.
The prevalence of obesity has increased dramatically in recent years. According to the World Health Organization, global obesity rates have tripled over the past five decades [17]. Numerous studies have demonstrated a strong association between obesity and elevated UA levels. For example, Li et al. reported that Chinese individuals with high UA levels also had significantly higher triglyceride concentrations [18]. Their multivariate logistic regression model identified a significant association between body mass index and UA (β = 0.202, p = 0.039), suggesting that chronic inflammation and oxidative stress associated with obesity may play a mechanistic role [19]. In our study, we selected WHR instead of body mass index as a surrogate marker for obesity. This choice was based on increasing evidence that WHR better reflects central (visceral) adiposity and its metabolic consequences compared to body mass index, which does not distinguish between fat and lean mass or account for fat distribution. WHR emerged as one of the most influential variables in both the Pearson correlation analysis and the MARS model. However, the nature of its association with UA differed substantially between the two methods. Pearson correlation suggested a modest linear relationship between WHR and UA (r = 0.226), implying a uniform increase in UA with increasing WHR. In contrast, the MARS model revealed a pronounced nonlinear, threshold-dependent relationship: WHR significantly influenced UA levels only below a threshold of 0.969, with little to no additional effect observed above this value. Specifically, the basis function max(0, 0.969 − WHR) had the largest absolute coefficient (−3.280) in the MARS model. Because this term is non-zero only when WHR is below 0.969 and its coefficient is negative, predicted UA falls steeply as WHR decreases below the threshold and, equivalently, rises as WHR increases toward it. This implies that the protective effect of low WHR on UA is most prominent below 0.969, and that once WHR exceeds this threshold, its additional impact on UA becomes minimal or flat.
This finding is physiologically meaningful. WHR < 0.969 typically represents individuals with relatively low visceral fat accumulation and preserved metabolic homeostasis. In this state, insulin sensitivity remains intact, systemic inflammation is low, and renal UA excretion is likely more efficient. However, as WHR increases beyond this threshold, the metabolic stress from visceral fat accumulation may have already saturated its effect on UA, thereby flattening the curve observed in the MARS model. This threshold phenomenon could not be captured by linear correlation analysis alone and highlights the value of MARS in uncovering nuanced, range-specific relationships.
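As a worked illustration of this threshold behavior, the Python sketch below evaluates the single WHR basis function using the knot (0.969) and coefficient (−3.280) reported in the text. The example WHR values are hypothetical, the other terms of the MARS equation are not reproduced, and the result is expressed in the model's own units, which depend on any transformation applied to UA during preprocessing.

```python
WHR_KNOT = 0.969   # threshold reported in the text
WHR_COEF = -3.280  # coefficient of max(0, 0.969 - WHR) reported in the text


def whr_term(whr):
    """Contribution of the WHR basis function to the model prediction."""
    gap = max(0.0, WHR_KNOT - whr)
    return WHR_COEF * gap if gap > 0 else 0.0


# Hypothetical WHR values spanning both sides of the threshold.
contributions = {whr: round(whr_term(whr), 3) for whr in (0.80, 0.90, 0.969, 1.05)}
```

The term is negative below the knot and shrinks toward zero as WHR approaches 0.969; at or above the knot it contributes nothing, which corresponds to the flat segment described above.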
The second most influential factor associated with serum UA levels in our study was creatinine. A clear positive association was observed between UA and creatinine levels, consistent with findings from previous studies. For instance, Joo et al. demonstrated a dose-dependent relationship between elevated UA and impaired renal function, reporting an adjusted odds ratio of 5.55 (95% CI: 3.27–9.44) for individuals in the lowest quartile of estimated glomerular filtration rate [20]. Several other cross-sectional and longitudinal studies have similarly shown that higher UA levels are associated with progressive decline in renal function [21,22,23,24,25]. The underlying physiological mechanisms linking UA and renal impairment are multifaceted. One key contributor is endothelial dysfunction, which can be induced by elevated serum UA. UA has been shown to inhibit endothelial cell proliferation and reduce the bioavailability of nitric oxide, a critical vasodilator involved in maintaining renal microvascular tone and perfusion [26,27,28]. Reduced NO availability leads to increased vascular resistance and compromised glomerular filtration, thereby contributing to nephron damage. Furthermore, UA may promote oxidative stress and inflammation in renal tissues, exacerbating tubulointerstitial injury and accelerating renal functional decline. As renal function deteriorates, the kidney’s ability to excrete UA diminishes, resulting in further accumulation of UA in the blood. This bi-directional relationship, where UA both contributes to and is affected by renal dysfunction, forms a pathological feedback loop that may explain the strong positive correlation observed in our study. Importantly, the MARS model highlighted a threshold effect, wherein the association between creatinine and UA becomes particularly pronounced above 0.97 mg/dL. 
This finding suggests that even mild elevations in creatinine, which may still fall within the clinically “normal” range, are associated with disproportionate increases in UA. This reinforces the notion that early renal microvascular changes may already be exerting measurable effects on systemic UA metabolism. Collectively, our findings support a pathophysiological model in which elevated UA not only reflects declining renal clearance but may also act as a contributing factor in the progression of renal dysfunction through mechanisms involving endothelial injury, oxidative stress, and altered hemodynamics.
Calcium plays a vital role in a wide range of cellular functions, including muscle contraction, hormone secretion, nerve conduction, and the activation of numerous enzymes [29]. Both calcium and UA are well-established contributors to the formation of urinary tract stones [30]. However, the relationship between serum UA and calcium levels remains controversial, with prior studies reporting inconsistent or conflicting results [31,32,33,34,35]. In our study, we observed a significant and independent positive correlation between serum UA and calcium levels, as identified by both Pearson correlation and the MARS model. One plausible explanation for this association is the involvement of chronic inflammation. Previous studies have shown that elevated UA is associated with increased levels of pro-inflammatory cytokines such as interleukin-6 and tumor necrosis factor-alpha [36,37,38]. Similarly, hypercalcemia has been linked to heightened inflammatory states, including elevations in C-reactive protein and interleukin-6 levels [39,40]. These parallel findings support the hypothesis that inflammation may act as a common underlying mechanism linking elevated serum levels of UA and calcium. The simultaneous elevation of these markers may thus reflect a shared pathophysiological response to systemic inflammatory burden. Further supporting this hypothesis is the role of hs-CRP, which emerged as the fourth most influential variable in our MARS model. Hs-CRP is a well-established marker of chronic low-grade inflammation and has been recognized since the 1990s as an independent predictor of cardiovascular events, confirmed by over 25 large-scale epidemiological studies [41]. In parallel, UA has been increasingly recognized not only as a marker of cardiovascular risk but also as a potential pro-inflammatory mediator [42]. For instance, Spiga et al. stratified UA levels into quartiles among 2731 non-diabetic individuals and reported that hs-CRP levels were significantly higher in the highest UA quartile [42]. Our results are consistent with this literature, further reinforcing the interconnection between elevated UA and systemic inflammation, as reflected by hs-CRP.
In addition to metabolic and inflammatory factors, lifestyle behaviors may also play a role in modulating UA levels. Betel nut exposure, a culturally prevalent practice in Southeast Asia, has been associated with a range of adverse health outcomes. For example, Huang et al. reported a significant association between betel nut exposure and an increased risk of metabolic syndrome [43], while other studies have suggested a possible role in promoting kidney stone formation [44]. Interestingly, a study by Tai et al. found an inverse association between betel nut use and hyperuricemia, with an odds ratio of 0.75 (95% CI: 0.66–0.84) [45]. However, their findings were based on logistic regression, which treats hyperuricemia as a binary outcome, thus limiting interpretation to the presence or absence of disease. In contrast, our use of the MARS model enabled the assessment of continuous, nonlinear associations between betel nut exposure and serum UA levels. This analytical approach revealed a nuanced dose–response relationship, suggesting that betel nut exposure may influence UA metabolism in a non-uniform manner. This novel finding adds depth to the current understanding of lifestyle and UA interactions and highlights the utility of MARS in uncovering complex, range-specific patterns that are not easily captured by traditional models.
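The contrast between a binary outcome and a continuous one can be made concrete with a small sketch. It uses synthetic numbers only (not study data), an assumed hyperuricemia cutoff of 7.0 mg/dL, and hypothetical exposure levels 0 to 3. It compares a single odds ratio with per-level mean UA and does not fit a MARS model.

```python
from collections import defaultdict

HYPERURICEMIA_CUTOFF = 7.0  # mg/dL, assumed cutoff for illustration only

# Synthetic (exposure level, serum UA in mg/dL) pairs; NOT study data.
RECORDS = [
    (0, 6.8), (0, 7.2), (0, 7.4), (0, 6.6),
    (1, 6.2), (1, 6.4), (1, 7.1), (1, 6.3),
    (2, 6.9), (2, 7.3), (2, 6.7), (2, 7.5),
    (3, 7.6), (3, 7.8), (3, 6.9), (3, 7.4),
]


def odds_ratio(records, cutoff=HYPERURICEMIA_CUTOFF):
    """Odds ratio of hyperuricemia for any exposure (level > 0) vs. none."""
    counts = {(e, h): 0 for e in (False, True) for h in (False, True)}
    for level, ua in records:
        counts[(level > 0, ua > cutoff)] += 1
    exposed_odds = counts[(True, True)] / counts[(True, False)]
    unexposed_odds = counts[(False, True)] / counts[(False, False)]
    return exposed_odds / unexposed_odds


def mean_ua_by_level(records):
    """Mean serum UA at each exposure level, keeping the outcome continuous."""
    grouped = defaultdict(list)
    for level, ua in records:
        grouped[level].append(ua)
    return {level: sum(v) / len(v) for level, v in sorted(grouped.items())}


print("binary view, odds ratio:", round(odds_ratio(RECORDS), 3))
for level, mean_ua in mean_ua_by_level(RECORDS).items():
    print(f"exposure level {level}: mean UA {mean_ua:.3f} mg/dL")
```

In this synthetic example the dichotomized comparison gives an odds ratio of 1, while the level means differ across the exposure range. That is the kind of range-specific pattern a binary outcome cannot show.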
The remaining four variables in the MARS model, age, γ-GT, FPG, and LDH, had comparatively smaller coefficients, indicating more modest contributions to serum UA levels. Nonetheless, their associations offer additional physiological insights. Age demonstrated a positive, albeit mild, association with UA. This aligns with findings from Kuzuya et al., who reported a positive longitudinal relationship between age and UA levels in a large cohort of 80,506 individuals of both sexes [46]. The age-related increase in UA may reflect cumulative oxidative stress, decreased renal clearance, or age-associated changes in purine metabolism. Interestingly, our study revealed a positive association between γ-GT and UA, contrary to several earlier studies that reported a negative relationship. Those studies were often conducted in disease-specific populations, such as individuals with diabetes [47], alcohol-related liver disease [48], or patients with metabolic syndrome [49,50]. In contrast, our analysis was performed in a healthy population, suggesting that γ-GT may correlate with UA even in the absence of overt disease. Given γ-GT’s role in glutathione metabolism and oxidative stress response, it is plausible that low-grade oxidative processes contribute to UA elevation even in subclinical states. For fasting plasma glucose, previous research has largely indicated a positive association with UA. However, our analysis supports a modest inverse relationship, in line with recent interventional evidence. Notably, a meta-analysis by Chen et al. involving four clinical trials and 314 patients found that treatment with allopurinol led to significant reductions in FPG (weighted mean difference: −0.61 mmol/L, 95% CI: −0.93 to −0.28) [51]. This observation suggests a potential bi-directional interaction between glucose metabolism and UA, possibly mediated through insulin resistance or oxidative stress pathways. 
Regarding LDH, existing literature linking this enzyme to UA has been mostly limited to pathological contexts such as preeclampsia [9,52]. In our healthy cohort, LDH was nonetheless retained as a predictor in the MARS model. LDH, a key enzyme in anaerobic glycolysis, may reflect underlying subclinical tissue turnover or low-grade inflammation, both of which could contribute to increased UA production. This novel finding positions LDH as a potentially underrecognized biomarker in UA regulation, meriting further investigation. Finally, triglycerides were identified as the least influential factor in the MARS model. Although a positive correlation between triglycerides and UA has been widely reported, such as in the small-scale study by Tariq et al. [53], the strength of this association was relatively weak in our analysis. One plausible explanation is shared dietary confounding, particularly high fructose intake. Fructose is known to simultaneously stimulate hepatic UA synthesis and triglyceride-rich lipoprotein production [54,55]. Thus, while triglycerides remain a relevant biomarker in hyperuricemia, their direct mechanistic link to UA may be secondary to underlying metabolic drivers such as diet composition, especially fructose consumption.
This study has several limitations that should be acknowledged. First, it employed a cross-sectional design, which inherently limits the ability to infer causal relationships between variables. Unlike longitudinal studies, this design cannot determine temporal sequences or directionality of associations. Second, the study population consisted exclusively of individuals from a single ethnic group, which may limit the generalizability of the findings. Caution is therefore warranted when extrapolating these results to other ethnic or demographic populations, as genetic, environmental, and cultural factors may influence uric acid metabolism and its associated risk factors. Lastly, it is noteworthy that while uric acid functions as a potent antioxidant in plasma under physiological conditions [1], this protective role is likely negated, or even reversed, in the context of renal dysfunction. As our MARS model highlights, even mild creatinine elevations (above 0.97 mg/dL), below conventional renal impairment thresholds, are associated with disproportionate UA increases. This suggests that once renal excretory capacity begins to falter, UA may transition from an antioxidant to a pro-oxidant and pro-inflammatory mediator within tissues, particularly in the kidney and vasculature [26,28]. Thus, the clinical implications of elevated UA must be interpreted in the context of renal function: what may be protective in a healthy individual could become pathogenic in early renal stress, a nuance captured by our threshold-based modeling.
5. Conclusions
This study leveraged the MARS model to uncover nonlinear, range-specific predictors of serum uric acid in healthy men, an advancement over traditional linear methods. Key variables such as WHR, creatinine, and hs-CRP showed threshold-dependent effects, offering novel physiological insights. Our findings highlight the model’s potential to enhance metabolic risk assessment through interpretable machine learning.
References
- 1. El Ridi R., Tallima H. Physiological functions and pathogenic potential of uric acid: A review. J. Adv. Res. 2017;8:487–493. doi: 10.1016/j.jare.2017.03.003. PMID: 28748115. PMCID: PMC5512149.
- 2. Lee M.-S., Lin S.-C., Chang H.-Y., Lyu L.-C., Tsai K.-S., Pan W.-H. High prevalence of hyperuricemia in elderly Taiwanese. Asia Pac. J. Clin. Nutr. 2005;14:285–292. PMID: 16169841.
- 3. Kuwabara M., Kodama T., Ae R., Kanbay M., Andres-Hernando A., Borghi C., Hisatome I., Lanaspa M.A. Update in uric acid, hypertension, and cardiovascular diseases. Hypertens. Res. 2023;46:1714–1726. doi: 10.1038/s41440-023-01273-3. PMID: 37072573.
- 4. Agabiti-Rosei E., Grassi G. Beyond gout: Uric acid and cardiovascular diseases. Curr. Med Res. Opin. 2013;29(Suppl. 3):33–39. doi: 10.1185/03007995.2013.790804. PMID: 23611366.
- 5. Sekizuka H. Uric acid, xanthine oxidase, and vascular damage: Potential of xanthine oxidoreductase inhibitors to prevent cardiovascular diseases. Hypertens. Res. 2022;45:772–774. doi: 10.1038/s41440-022-00891-7. PMID: 35301451.
- 6. Li L., Zhang Y., Zeng C. Update on the epidemiology, genetics, and therapeutic options of hyperuricemia. Am. J. Transl. Res. 2020;12:3167–3181. PMID: 32774692. PMCID: PMC7407685.
- 7. Ni Q., Lu X., Chen C., Du H., Zhang R. Risk factors for the development of hyperuricemia: A STROBE-compliant cross-sectional and longitudinal study. Medicine 2019;98:e17597. doi: 10.1097/MD.0000000000017597. PMID: 31626136. PMCID: PMC6824661.
- 8. Sampa M.B., Hossain N., Hoque R., Islam R., Yokota F., Nishikitani M., Ahmed A. Blood Uric Acid Prediction with Machine Learning: Model Development and Performance Comparison. JMIR Public Health Surveill. 2020;8:e18331. doi: 10.2196/18331. PMID: 33030442. PMCID: PMC7582147.
