# Development and Evaluation of a Urinary Na/K Ratio Prediction Model: A Systematic Comparison from Attention-Based Deep Learning to Classical Ensemble Approaches

**Authors:** Emi Yuda, Itaru Kaneko, Daisuke Hirahara

PMC · DOI: 10.3390/bioengineering13020252 · Bioengineering · 2026-02-21

## TL;DR

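This study compares machine learning models for predicting the urinary sodium-to-potassium (Na/K) ratio from four routine measurements (body weight, systolic and diastolic blood pressure, and pulse rate) and finds that a simple averaging ensemble of classical models generalizes better than attention-based deep learning on a small dataset (N = 82).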

## Contribution

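Demonstrates that, with only 82 participants and under nested cross-validation, an equal-weight ensemble of classical models outperforms attention-based and Transformer-based deep learning for urinary Na/K ratio prediction.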

## Key findings

- Simple averaging of Random Forest, Gradient Boosting, and Linear Regression achieved best performance (MAE = 1.756, R2 = 0.390).
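- The attention-based MIDIP model reduced training error but was unstable under nested validation, and adding a Transformer to the ensemble degraded performance (MAE = 1.855, R2 = 0.294).
- Equal-weight averaging generalized better than adaptive weighting (AWE) of the same three base models (MAE = 1.836, R2 = 0.266) and better than the configurations that included deep learning.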
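
To make the comparison in the findings above concrete, the following is a minimal sketch of the three reported metrics and of equal-weight versus adaptively weighted averaging. It is not the authors' implementation: the summary does not specify how AWE computes its weights, so the inverse-error weighting here is an assumption, and all function names and toy numbers are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def simple_average(predictions):
    """Equal-weight ensemble: per-sample mean over the base models."""
    n_models = len(predictions)
    return [sum(col) / n_models for col in zip(*predictions)]

def inverse_error_weights(val_errors):
    """ASSUMED adaptive scheme: weight each model by 1 / validation error."""
    inv = [1.0 / e for e in val_errors]
    total = sum(inv)
    return [w / total for w in inv]

def weighted_average(predictions, weights):
    """Weighted ensemble: per-sample weighted sum over the base models."""
    return [sum(w * p for w, p in zip(weights, col)) for col in zip(*predictions)]

# Toy, made-up numbers standing in for three base models' predictions.
y_true = [2.0, 4.0, 6.0]
base_preds = [[1.0, 4.5, 7.0], [3.0, 5.0, 5.5], [2.5, 3.0, 6.0]]

equal = simple_average(base_preds)
# In practice the errors would come from held-out validation data,
# not from the samples being scored as done here for brevity.
weights = inverse_error_weights([mae(y_true, p) for p in base_preds])
adaptive = weighted_average(base_preds, weights)

print("equal-weight MAE:", mae(y_true, equal))
print("adaptive MAE:   ", mae(y_true, adaptive))
```

One plausible reading of the reported result is that adaptive weights are themselves estimated from a small validation fold and so add variance, whereas equal weights have nothing to estimate.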

## Abstract

The urinary sodium-to-potassium (Na/K) ratio is a clinically established predictor of blood pressure and cardiovascular risk. This study aimed to develop and rigorously evaluate machine learning models for estimating the urinary Na/K ratio using four easily obtainable physiological variables: body weight, systolic blood pressure, diastolic blood pressure, and pulse rate. A dataset of 82 participants was analyzed under a nested cross-validation framework to ensure strict generalization assessment. We first designed an attention-based deep learning model (MIDIP: Multi-Integrated Deep Ion Prediction). Although MIDIP showed reduced training error, nested validation revealed performance instability, indicating overfitting in this small-sample setting. We then compared classical machine learning models and ensemble strategies. Among all configurations, simple averaging of Random Forest, Gradient Boosting, and Linear Regression (Group A) achieved the best performance (MAE = 1.756, RMSE = 2.349, R2 = 0.390). In contrast, incorporating a Transformer model (Group B) degraded performance (MAE = 1.855, R2 = 0.294). Similarly, adaptive weighting (AWE) did not improve accuracy (Group A: MAE = 1.836, R2 = 0.266; Group B: MAE = 2.133, R2 = 0.035). These results demonstrate that, under limited sample conditions (N = 82), model simplicity and equal-weight ensemble integration provide superior generalization compared to attention-based or adaptively weighted deep architectures. The findings underscore the importance of strict validation and controlled model complexity when developing clinically applicable prediction models from small datasets.
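
The nested cross-validation framework mentioned in the abstract can be sketched as follows. This is an illustrative outline only: the fold counts, the single synthetic feature, the ridge penalty grid, and every function name are assumptions, since the summary reports neither the study's fold configuration nor its hyperparameter search. The key property shown is that hyperparameters are chosen in an inner loop that never sees the outer test fold.

```python
import random

def k_fold_indices(n, k, seed):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge_1d(x, y, lam):
    """Closed-form fit of y ~ a*x + b with an L2 penalty lam on the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / (sxx + lam)
    return a, my - a * mx

def mae_of(model, x, y):
    """Mean absolute error of a fitted (slope, intercept) model."""
    a, b = model
    return sum(abs(yi - (a * xi + b)) for xi, yi in zip(x, y)) / len(y)

def take(values, indices):
    return [values[i] for i in indices]

def cv_mae(x, y, lam, k, seed):
    """Plain k-fold CV score for one hyperparameter value."""
    scores = []
    for held in k_fold_indices(len(x), k, seed):
        held_set = set(held)
        train = [i for i in range(len(x)) if i not in held_set]
        model = fit_ridge_1d(take(x, train), take(y, train), lam)
        scores.append(mae_of(model, take(x, held), take(y, held)))
    return sum(scores) / len(scores)

def nested_cv(x, y, lams, outer_k=5, inner_k=3, seed=0):
    """Outer loop estimates generalization; inner loop selects lam."""
    outer_scores = []
    for fold_no, test_idx in enumerate(k_fold_indices(len(x), outer_k, seed)):
        test_set = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in test_set]
        x_tr, y_tr = take(x, train_idx), take(y, train_idx)
        # Model selection uses the outer-training data only.
        best_lam = min(
            lams, key=lambda lam: cv_mae(x_tr, y_tr, lam, inner_k, seed + fold_no + 1)
        )
        model = fit_ridge_1d(x_tr, y_tr, best_lam)
        outer_scores.append(mae_of(model, take(x, test_idx), take(y, test_idx)))
    return outer_scores

# Synthetic stand-in data with the study's sample size (N = 82); not real measurements.
rng = random.Random(42)
x = [rng.uniform(90.0, 160.0) for _ in range(82)]
y = [0.03 * xi + rng.gauss(0.0, 1.5) for xi in x]

scores = nested_cv(x, y, lams=[0.0, 1.0, 10.0])
print("outer-fold MAE values:", scores)
print("mean outer MAE:", sum(scores) / len(scores))
```

The spread of the outer-fold scores, not just their mean, is the kind of signal that would expose the performance instability the abstract attributes to the attention-based model.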

## Full-text entities

- **Diseases:** obesity (MESH:D009765), injury to (MESH:D014947), gastroenteritis (MESH:D005759), chronic kidney disease (MESH:D051436), hyperaldosteronism (MESH:D006929), hypertension (MESH:D006973)
- **Chemicals:** metal (MESH:D008670), salt (MESH:D012492), K (MESH:D011188), Na (MESH:D012964)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12938205/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12938205/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/PMC12938205/full.md

---
Source: https://tomesphere.com/paper/PMC12938205