# Genomic prediction in a small barley population can benefit from training on related populations

**Authors:** Cathrine Kiel Skovbjerg, Pernille Sarup, Ellen Margrethe Wahlström, Jens Due Jensen, Lotte Olesen, Jihad Orabi, Just Jensen, Guillaume P Ramstein, Ahmed Jahoor

PMC · DOI: 10.1093/g3journal/jkaf218 · 2025-10-23

## TL;DR

This study shows that using data from related barley populations can improve predictions in new breeding programs, especially in the early stages.

## Contribution

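The study evaluates when genomic prediction in a small, recently established 6-rowed winter (6RW) barley breeding program benefits from adding training data from related breeding programs. It compares univariate and multivariate models across traits, training population sizes, and genetic distances between populations.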

## Key findings

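- In the early stages of the 6-rowed winter (6RW) barley breeding program, including data from an external population could improve prediction accuracy, but the gain depended on the population and the trait.
- Multivariate models often performed significantly worse than univariate models for multipopulation genomic prediction, especially when data were sparse, despite strong genetic correlations between the populations.
- When data from all 4 years were available, multipopulation genomic prediction generally performed similarly to within-population prediction.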
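
The parameter burden behind the multivariate result can be illustrated with a standard GBLUP-style formulation. This is a sketch under assumptions: the notation below ($\mathbf{G}$ for a genomic relationship matrix, $p$ for the number of populations, $\boldsymbol{\Sigma}_g$ for the between-population genetic covariance matrix) is generic textbook notation, not taken from the paper, whose exact model specification may differ.

```latex
% Univariate (pooled) model: all populations share one genetic variance.
\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{g} + \mathbf{e}, \qquad
\mathbf{g} \sim N(\mathbf{0}, \mathbf{G}\sigma_g^2), \qquad
\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)

% Multivariate model: population k = 1, ..., p is treated as its own trait.
\mathbf{y}_k = \mathbf{1}\mu_k + \mathbf{Z}_k\mathbf{g}_k + \mathbf{e}_k, \qquad
(\mathbf{g}_1', \dots, \mathbf{g}_p')' \sim N(\mathbf{0}, \boldsymbol{\Sigma}_g \otimes \mathbf{G}), \qquad
\mathbf{e}_k \sim N(\mathbf{0}, \mathbf{I}\sigma_{e_k}^2)

% Variance parameters to estimate:
% univariate: 2; multivariate: p(p+1)/2 genetic (co)variances + p residual variances.
\text{e.g. } p = 2: \quad \tfrac{2 \cdot 3}{2} + 2 = 5 \ \text{versus} \ 2
```

Under this formulation the number of genetic (co)variances grows quadratically with the number of populations, which is consistent with the multivariate model struggling when few observations are available.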

## Abstract

Genomic prediction (GP) has been shown to be a valuable tool for genetic improvement in breeding programs but requires large training populations in order to build robust models. These are difficult to obtain for newly established breeding programs. Here, we aimed to overcome this challenge by combining datasets from 4 different barley breeding programs, utilizing up to 12 years of data to increase prediction accuracy in a more recently established 6-rowed winter (6RW) barley breeding program. By allowing data to accumulate in a breeding program as the years progress, we investigated when GP accuracy in 6RW benefitted from external populations. To do this, we focused on several parameters: training population size, choice of model for multipopulation GP (univariate versus multivariate), the key trait under investigation (grain yield, plant height, or rust resistance), and genetic distance between populations. We found that in the early stages of a breeding program, prediction of the 6RW population could benefit from inclusion of an external population, but the advantage depended on the specific population and trait under investigation. However, when data from all 4 years were available, multipopulation GP generally performed similarly to within-population GP. Additionally, when comparing multivariate and univariate models for multipopulation GP, the multivariate model often performed significantly worse, despite strong genetic correlations between the populations involved. This was especially the case when data were sparse and the model required estimation of numerous parameters from a small number of observations. Altogether, our results suggest that multipopulation GP is beneficial only in the very early stages of new breeding programs, emphasizing its relevance for newly established breeding programs or new breeding goals, especially for related populations.
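
The comparison described in the abstract (within-population versus pooled multipopulation training) can be sketched on simulated data. This is a minimal illustration only, not the authors' pipeline: the marker coding (0/1/2), population sizes, effect sizes, the ridge penalty `lam`, and all function names are assumptions, and closed-form ridge regression is used as a simple stand-in for a univariate genomic prediction model.

```python
import numpy as np


def simulate_population(n_lines, marker_effects, rng, noise_sd=1.0):
    """Simulate marker genotypes (coded 0/1/2, an assumption) and phenotypes."""
    n_markers = marker_effects.shape[0]
    genotypes = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
    phenotypes = genotypes @ marker_effects + rng.normal(0.0, noise_sd, n_lines)
    return genotypes, phenotypes


def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centered markers."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    lhs = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(lhs, Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta


def ridge_predict(model, X):
    """Predict phenotypes for new lines from a fitted ridge model."""
    x_mean, y_mean, beta = model
    return y_mean + (X - x_mean) @ beta


def predictive_correlation(train_X, train_y, test_X, test_y, lam=50.0):
    """Correlation between predicted and observed phenotypes in the test set."""
    model = ridge_fit(train_X, train_y, lam)
    return float(np.corrcoef(ridge_predict(model, test_X), test_y)[0, 1])


rng = np.random.default_rng(0)
n_markers = 200

# Hypothetical marker effects: the external population's effects are
# correlated with, but not identical to, those of the target population.
target_effects = rng.normal(0.0, 0.1, n_markers)
external_effects = target_effects + rng.normal(0.0, 0.05, n_markers)

# Small target population (stand-in for a young program), larger external one.
target_X, target_y = simulate_population(60, target_effects, rng)
external_X, external_y = simulate_population(400, external_effects, rng)

# Hold out half of the target lines for validation.
train_X, train_y = target_X[:30], target_y[:30]
test_X, test_y = target_X[30:], target_y[30:]

# Within-population training: target lines only.
within = predictive_correlation(train_X, train_y, test_X, test_y)

# Multipopulation training: target lines pooled with the external population.
pooled_X = np.vstack([train_X, external_X])
pooled_y = np.concatenate([train_y, external_y])
multi = predictive_correlation(pooled_X, pooled_y, test_X, test_y)

print(f"within-population accuracy: {within:.3f}")
print(f"multipopulation accuracy:   {multi:.3f}")
```

Varying the number of target training lines (here 30) mimics data accumulating over years, which is the axis along which the study reports the benefit of external data shrinking. The simulated outcome depends on the assumed settings and says nothing about the barley data themselves.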

## Full-text entities

- **Diseases:** infection (MESH:D007239), Fusarium head blight (MESH:D006258)
- **Chemicals:** 6RS (-), deoxynivalenol (MESH:C007262)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530], Malus domestica (apple, species) [taxon 3750], Hordeum vulgare (barley, species) [taxon 4513], Bos taurus (bovine, species) [taxon 9913], Solanum tuberosum (potatoes, species) [taxon 4113], Glycine max (soybean, species) [taxon 3847]
- **Cell lines:** PC3 — Homo sapiens (Human), Prostate carcinoma, Cancer cell line (CVCL_0035)

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12610402/full.md

---
Source: https://tomesphere.com/paper/PMC12610402