# Seed quality drives grain yield in Ethiopian and Senegalese sorghum: Insights from machine learning

**Authors:** Ezekiel Ahn, Louis K. Prom, Jae Hee Jang, Insuck Baek, Adama R. Tukuli, Seunghyun Lim, Seok Min Hong, Moon S. Kim, Lyndel W. Meinhardt, Sunchung Park, Clint Magill, Nguyen-Thanh Son, Somashekhar Punnuri

PMC · DOI: 10.1371/journal.pone.0329366 · PLOS One · 2025-08-14

## TL;DR

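This study finds that seed quality traits, specifically seed weight and germination rate, are the strongest predictors of grain yield among the traits examined in Ethiopian and Senegalese sorghum, suggesting that early selection on seed quality is a practical breeding strategy for resource-limited areas.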

## Contribution

The study applies a phenotype-informed machine learning (PIML) framework to nine phenotypic traits in 179 Ethiopian and Senegalese sorghum accessions and identifies seed quality as a key predictor of grain yield.

## Key findings

- Seed weight and germination rate were the strongest predictors of grain yield.
- A Neural Boosted model achieved a mean R² of 0.36 for yield prediction.
- Disease resistance traits had limited predictive value for grain yield.

## Abstract

Accurately predicting grain yield remains a major challenge in sorghum breeding, particularly across genetically and geographically diverse germplasm. To address this, we applied a phenotype-informed machine learning (PIML) framework to analyze nine phenotypic traits in 179 Ethiopian and Senegalese accessions. Using hierarchical clustering and oversampling with ADASYN, we achieved high classification accuracy (0.99) for phenotypic group assignment. Grain yield prediction was most effective with a Neural Boosted model (NTanH(3)NBoost(8)), achieving a mean R² of 0.36 and RASE (equivalent to RMSE) of 4.87. Feature importance analysis consistently identified seed weight and germination rate as the strongest predictors of grain yield, while disease resistance traits showed limited predictive value. These findings suggest that early selection based on seed quality traits may provide a practical strategy for improving sorghum yield under field conditions, especially in resource-limited environments.
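
The two regression metrics quoted in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of R² and RASE (RMSE); whether the paper's software computes them in exactly this way, and how the "mean" R² was averaged, is not stated in this summary. The function names and the `observed` and `predicted` values are hypothetical and are not data from the paper.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rase(y_true, y_pred):
    """Root average squared error, the same quantity as RMSE."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical yield values for illustration only (not from the paper).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 17.0, 17.0]

print(f"R^2  = {r_squared(observed, predicted):.3f}")
print(f"RASE = {rase(observed, predicted):.3f}")
```

Under these definitions, an R² of 0.36 means the model accounts for roughly 36% of the variance in grain yield in the evaluated data, and RASE is expressed in the same units as the yield measurement.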

## Linked entities

- **Species:** Sorghum (taxon 4557)

## Full-text entities

- **Species:** Sorghum bicolor (broomcorn, species) [taxon 4558]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12352784/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12352784/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12352784/full.md

---
Source: https://tomesphere.com/paper/PMC12352784