# Integrating biological and machine learning models for rainbow trout growth: Balancing accuracy and interpretability

**Authors:** Lawrence Fulton, Pin Lyu

PMC · DOI: 10.1371/journal.pone.0336890 · PLOS One · 2026-03-19

## TL;DR

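This paper shows that combining mechanistic growth models (von Bertalanffy, Gompertz) with machine learning improves predictions of rainbow trout growth in the Lower Colorado River while keeping biologically interpretable parameters such as asymptotic size and growth rate.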

## Contribution

The study evaluates hybrid frameworks that integrate biological growth models with machine learning for ecological forecasting, using ten years of tag–recapture data and environmental covariates.

## Key findings

- A stacked ensemble of XGBoost and the von Bertalanffy growth model (VBGM) achieved the best performance, with an RMSE of 15.96 mm and R² of 0.966.
- The top methods reduced error by 70–80% relative to baseline models, equivalent to 45–70 mm or 20–32% of mean fish length.
- Feature importance analysis identified initial length, time at large, and weight at release as dominant predictors.
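
The metrics quoted above (RMSE in mm, R², and percent error reduction against a baseline) can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the function names and the toy lengths are hypothetical and are not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs (mm here)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error improves on baseline_error."""
    return (baseline_error - model_error) / baseline_error


# Made-up recapture lengths in mm, for illustration only.
observed = [250.0, 300.0, 350.0, 400.0]
predicted = [260.0, 290.0, 355.0, 395.0]

print(f"RMSE: {rmse(observed, predicted):.2f} mm")
print(f"R^2:  {r_squared(observed, predicted):.3f}")
```

A reduction of 0.7 to 0.8 from `relative_reduction` corresponds to the 70–80% range reported in the findings.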

## Abstract

Invasive species management demands predictive models that balance accuracy with ecological interpretability, yet traditional approaches often fail to capture complex environmental interactions. We evaluated hybrid frameworks integrating biological and machine learning models for rainbow trout (Oncorhynchus mykiss) growth in the Lower Colorado River using ten years of tag–recapture data and environmental covariates, comparing traditional and Bayesian von Bertalanffy (VBGM) and Gompertz models with Random Forests, XGBoost, LightGBM, Support Vector Regression, Neural Networks, and ensemble methods through probabilistic performance analysis. Incorporating environmental context and advanced modeling produced substantial gains, with top methods achieving 70–80 percent error reductions relative to baseline models, equivalent to 45–70 mm or 20–32 percent of mean fish length. A stacked ensemble of XGBoost and the VBGM achieved the best performance (RMSE = 15.96 mm, R² = 0.966) and exhibited stochastic dominance across the posterior, while gradient boosting models formed a strong second tier, led by LightGBM and XGBoost. Bayesian Model Averaging reached comparable accuracy while explicitly quantifying uncertainty. Even traditional mechanistic models improved by up to 80 percent when enhanced with covariates and Bayesian estimation, preserving biological interpretability through parameters such as asymptotic size and growth rate. Feature importance analysis identified initial length, time at large, and weight at release as dominant predictors, and the stacked ensemble outperformed baseline models in over 99 percent of posterior samples. These results establish hybrid ensemble frameworks as powerful tools for ecological forecasting that unite predictive performance with mechanistic insight, providing a generalizable template for systems where both accuracy and interpretability are required.
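
To make the idea of stacking a mechanistic model with a machine learning model concrete, here is a rough sketch under stated assumptions. It uses the Fabens increment form of the von Bertalanffy model, a standard formulation for tag–recapture data, in which recapture length is L1 + (L∞ − L1)(1 − e^(−K·Δt)). Whether the paper uses this exact parameterization, and how its stacking step is built, is not stated in this summary. The no-intercept linear meta-learner, the parameter values `l_inf` and `k`, and the `ml_pred` numbers (stand-ins for the output of a model such as XGBoost) are all hypothetical.

```python
import math


def vbgm_recapture_length(length_at_release, years_at_large, l_inf, k):
    """Fabens form of the von Bertalanffy model: predicted length at recapture.

    l_inf is the asymptotic length and k the growth rate coefficient.
    """
    growth = (l_inf - length_at_release) * (1.0 - math.exp(-k * years_at_large))
    return length_at_release + growth


def fit_stack_weights(base_a, base_b, observed):
    """Least-squares weights (w_a, w_b) for observed ~ w_a*base_a + w_b*base_b.

    Solves the 2x2 normal equations with Cramer's rule (no intercept term).
    """
    s_aa = sum(a * a for a in base_a)
    s_bb = sum(b * b for b in base_b)
    s_ab = sum(a * b for a, b in zip(base_a, base_b))
    s_ay = sum(a * y for a, y in zip(base_a, observed))
    s_by = sum(b * y for b, y in zip(base_b, observed))
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) <= 1e-12 * s_aa * s_bb:
        raise ValueError("base predictions are collinear; weights are not unique")
    w_a = (s_ay * s_bb - s_by * s_ab) / det
    w_b = (s_by * s_aa - s_ay * s_ab) / det
    return w_a, w_b


# Illustrative values only (mm and years); not data or estimates from the paper.
release_lengths = [200.0, 250.0, 300.0, 350.0]
years_at_large = [1.0, 0.5, 2.0, 1.5]
observed = [262.0, 281.0, 371.0, 388.0]

# Hypothetical mechanistic predictions with assumed l_inf = 450 mm, k = 0.3 / year.
vbgm_pred = [
    vbgm_recapture_length(l1, dt, l_inf=450.0, k=0.3)
    for l1, dt in zip(release_lengths, years_at_large)
]
# Hypothetical stand-in for predictions from a machine learning model.
ml_pred = [260.0, 283.0, 368.0, 390.0]

w_vbgm, w_ml = fit_stack_weights(vbgm_pred, ml_pred, observed)
stacked = [w_vbgm * v + w_ml * m for v, m in zip(vbgm_pred, ml_pred)]
print(f"weights: VBGM={w_vbgm:.3f}, ML={w_ml:.3f}")
```

In practice the meta-learner weights would be fitted on out-of-fold predictions rather than on the same data used to train the base models, so that the stack does not simply reward the model that overfits most.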

## Linked entities

- **Species:** Oncorhynchus mykiss (taxon 8022)

## Full-text entities

- **Diseases:** drought (MESH:C536747), VBGM (MESH:D006130)
- **Chemicals:** cortisol (MESH:D006854), Reactive phosphorus (-), P (MESH:D010758)
- **Species:** Salmo trutta (river trout, species) [taxon 8032], Xyrauchen texanus (razorback sucker, species) [taxon 154827], Oncorhynchus mykiss (rainbow trout, species) [taxon 8022], Gila cypha (humpback chub, species) [taxon 67541], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13001953/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13001953/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/PMC13001953/full.md

---
Source: https://tomesphere.com/paper/PMC13001953