# Leveraging Fst and Genetic Distance to Optimize Reference Sets for Enhanced Cross-Population Genomic Prediction

**Authors:** Le Zhou, Lin Zhu, Fengying Ma, Mingjuan Gu, Risu Na, Wenguang Zhang

PMC · DOI: 10.3390/ani16030359 · Animals : an Open Access Journal from MDPI · 2026-01-23

## TL;DR

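This paper uses Fst-based SNP screening and Euclidean genetic distance to select individuals from other populations that are genetically closest to a target population, then adds them to the reference set to improve cross-population genomic prediction accuracy.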

## Contribution

A novel Fst-based strategy is proposed to construct cross-population reference sets for enhanced genomic prediction.

## Key findings

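- Adding the top 10–20% of individuals most genetically similar to the target population to the reference set significantly improves the accuracy and robustness of genomic estimated breeding value (GEBV) predictions.
- Among the methods evaluated, ssGBLUP and wGBLUP performed best, with prediction accuracy increasing as the mixing proportion rose to 20%, the highest level tested.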

## Abstract

Genomic prediction across populations often suffers from low accuracy due to genetic differences. This study introduces an Fst-based approach to select individuals from other populations that are genetically similar to the target population. By including the top 10–20% most similar individuals, prediction accuracy and robustness were significantly improved, with methods like ssGBLUP performing best. This strategy helps reduce bias and enhances breeding efficiency across diverse populations.

Genomic selection often faces challenges of insufficient prediction accuracy in cross-population applications, primarily due to differences in linkage disequilibrium patterns between populations. This study proposes an Fst-based strategy to enhance prediction performance by constructing a cross-population reference set with high genetic similarity to the target population (PopA). By integrating Fst-mediated SNP screening and Euclidean genetic distance analysis, the top 10%, 15% and 20% of individuals genetically most similar to PopA were screened from PopB and PopC, respectively, leading to the generation of six reference sets characterized by different mixing proportions. The results demonstrate that incorporating the top 10–20% of the most similar individuals significantly improves the accuracy and robustness of genomic estimated breeding value predictions. Among the methods evaluated, ssGBLUP and wGBLUP performed best, with prediction accuracy increasing as the mixing proportion rose up to 20%. This approach effectively mitigates structural bias caused by inter-population genetic differences and significantly enhances prediction efficiency. The multi-level mixing experiment not only validates the practical value of Fst and Euclidean distance but also provides theoretical support and a feasible solution for the efficient integration of cross-population germplasm resources.
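The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. The abstract does not state several details, so the sketch assumes them: a Wright-style per-SNP Fst computed from allele frequencies, keeping the most differentiated SNPs (the `fst_quantile` cutoff and its direction are hypothetical), measuring Euclidean distance from each candidate to the PopA centroid, and forming each reference set as PopA plus the selected individuals. All function names and the simulated genotypes are invented for the example.

```python
import numpy as np


def per_snp_fst(geno_a, geno_b):
    """Per-SNP Fst = (H_T - H_S) / H_T from 0/1/2 genotype matrices (rows = individuals)."""
    p_a = geno_a.mean(axis=0) / 2.0          # allele frequency in population A
    p_b = geno_b.mean(axis=0) / 2.0          # allele frequency in population B
    n_a, n_b = geno_a.shape[0], geno_b.shape[0]
    w_a = n_a / (n_a + n_b)                  # sample-size weights
    w_b = 1.0 - w_a
    p_t = w_a * p_a + w_b * p_b              # pooled allele frequency
    h_t = 2.0 * p_t * (1.0 - p_t)            # total expected heterozygosity
    h_s = w_a * 2.0 * p_a * (1.0 - p_a) + w_b * 2.0 * p_b * (1.0 - p_b)
    with np.errstate(divide="ignore", invalid="ignore"):
        # SNPs that are monomorphic in the pooled sample get Fst = 0
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)


def select_similar(geno_target, geno_candidate, snp_mask, fraction):
    """Indices of the candidates closest (Euclidean) to the target centroid."""
    centroid = geno_target[:, snp_mask].mean(axis=0)
    dist = np.linalg.norm(geno_candidate[:, snp_mask] - centroid, axis=1)
    n_keep = max(1, int(round(fraction * geno_candidate.shape[0])))
    return np.argsort(dist, kind="stable")[:n_keep]


def build_reference_sets(target, candidates, fractions=(0.10, 0.15, 0.20),
                         fst_quantile=0.9):
    """Return {(population name, fraction): target stacked with selected candidates}."""
    sets = {}
    for name, geno in candidates.items():
        fst = per_snp_fst(target, geno)
        # Assumption: screen SNPs by keeping those at or above an Fst quantile
        mask = fst >= np.quantile(fst, fst_quantile)
        for frac in fractions:
            idx = select_similar(target, geno, mask, frac)
            sets[(name, frac)] = np.vstack([target, geno[idx]])
    return sets


# Simulated genotypes purely for illustration (not data from the paper)
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)


def simulate(freq, n):
    return rng.binomial(2, freq, size=(n, n_snps))


pop_a = simulate(freq_a, 60)
pop_b = simulate(np.clip(freq_a + rng.normal(0, 0.10, n_snps), 0.05, 0.95), 40)
pop_c = simulate(np.clip(freq_a + rng.normal(0, 0.20, n_snps), 0.05, 0.95), 40)

reference_sets = build_reference_sets(pop_a, {"PopB": pop_b, "PopC": pop_c})
for (name, frac), ref in reference_sets.items():
    print(name, frac, ref.shape)
```

Two candidate populations crossed with three proportions yields the six reference sets mentioned above. Each set would then be passed to a prediction model such as GBLUP, which this sketch does not cover.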

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12896712/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12896712/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12896712/full.md

---
Source: https://tomesphere.com/paper/PMC12896712