# Development of user‐selectable diverse sets of cultivated and wild soybean germplasm for genetic and breeding applications

**Authors:** Qijian Song, Susan Araya, Chuck Quigley, Patrick Elia

PMC · DOI: 10.1002/tpg2.70216 · The Plant Genome · 2026-03-09

## TL;DR

This paper defines diverse sets (DS) of cultivated and wild soybean germplasm from SoySNP50K marker data and provides a Soy-DS Selector table that lets users assemble custom sets, preserving genetic diversity for breeding.

## Contribution

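The main contribution is the 'Soy-DS Selector', a table of germplasm clusters covering the USDA collection and its maturity groups, from which users can assemble diverse sets tailored to maturity group, sample size, preferred accessions, and seed availability.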

## Key findings

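- Diverse sets (DS) captured 94.9%–98.4% of SNP diversity in cultivated soybean, versus 73.1%–93.9% for random sets (RS); in wild soybean, DS captured 92.8%–97.9% versus 83.4%–97.7% for RS.
- Validation with whole-genome sequences from 1511 accessions (1308 cultivated, 203 wild) showed that DS retained the diversity predicted by the SNP subset.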
- The Soy-DS Selector enables users to build custom sets based on maturity group, sample size, and seed availability.

## Abstract

After decades of intensive breeding, modern US soybean [Glycine max (L.) Merr.] varieties have achieved significant improvements in yield, quality, and stress tolerance, but these gains have come at the cost of severely reduced genetic diversity. To reduce vulnerability and promote efficient use of germplasm, diverse sets (DS) of varying sample sizes were defined for the entire USDA‐ARS Soybean Germplasm Collection and 13 maturity groups using the SoySNP50K single‐nucleotide polymorphism (SNP) profile. The average retained genetic diversity of the 50K SNPs was then compared between 10 DS and 10 random sets (RSs) at different sizes. DS consistently outperformed random sampling: in cultivated soybean, DS captured 94.9%–98.4% of SNP diversity compared with 73.1%–93.9% for RS; in wild soybean, DS captured 92.8%–97.9% compared with 83.4%–97.7% for RS. The performance of DS was further validated using whole‐genome sequences from 1511 accessions, demonstrating that DS could retain the diversity predicted by the SNP subset across 1308 cultivated and 203 wild soybean genomes of different sample sizes. DS was also effective in capturing genetic diversity across different traits. To allow users to select DS, a “Soy‐DS Selector” approach was proposed, and a table containing germplasm clusters across the USDA collection and different maturity groups was created. This resource enables researchers to tailor combinations based on maturity groups, accession and sample size preferences, and seed availability. The study provides both methodology and resources that can streamline germplasm evaluation, maximize resource utilization, and enhance future genetic improvement in soybean. Several DS have already been used by US soybean breeders in their programs.
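The DS-versus-RS comparison described above can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not state how retained diversity was computed or how DS were chosen, so the code assumes a simplified measure (the share of SNPs polymorphic in the full collection that remain polymorphic in the subset) and a generic greedy max-min distance selection. The function names and the toy genotype matrix are hypothetical.

```python
def polymorphic_sites(rows):
    """Return indices of SNP columns that show more than one allele."""
    return {j for j in range(len(rows[0])) if len({r[j] for r in rows}) > 1}


def retained_fraction(full, subset_idx):
    """Assumed metric: share of the collection's polymorphic SNPs
    that are still polymorphic within the chosen subset."""
    full_sites = polymorphic_sites(full)
    sub_sites = polymorphic_sites([full[i] for i in subset_idx])
    return len(sub_sites & full_sites) / len(full_sites)


def hamming(a, b):
    """Number of SNP positions at which two accessions differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_diverse_set(full, k):
    """Assumed selection rule (greedy max-min): start from accession 0,
    then repeatedly add the accession farthest from those already chosen."""
    chosen = [0]
    while len(chosen) < k:
        candidates = [i for i in range(len(full)) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: min(hamming(full[i], full[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Toy genotype matrix: 4 hypothetical accessions x 3 biallelic SNPs (0/1).
toy = [
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
]

diverse = greedy_diverse_set(toy, 3)
print("diverse set:", diverse, "retained:", retained_fraction(toy, diverse))
print("poor pick [0, 3] retained:", retained_fraction(toy, [0, 3]))
```

In the toy data, accessions 0 and 3 are identical, so a subset containing only those two retains no polymorphic SNPs, while a distance-driven pick avoids such redundancy. The same contrast, at a much larger scale and with the study's own methods, underlies the reported DS and RS percentages.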

- Methodology and resources were provided that can streamline germplasm evaluation and utilization and enhance soybean improvement.
- Diverse sets of soybean germplasm were developed for the entire USDA‐ARS Soybean Germplasm Collection and for different maturity groups.
- A Soy‐DS Selector table was created that allows users to easily build their own custom diverse sets based on maturity group, sample size, preferred germplasm, and so forth.
- Several soybean diverse sets have already been used by US soybean breeders for genetic or breeding research.

US soybean breeding has long improved yield, quality, and resistance but sharply reduced genetic diversity—over 95% of modern lines come from only a few ancestors. To address this loss, researchers analyzed ∼20,000 cultivated and wild soybean accessions using ∼50,000 genetic markers to measure how genetically different each accession is from others. They built genetic diversity sets that capture the broadest possible variation and found that 300–400 well‐chosen accessions can represent most diversity in the entire collection. To support breeders, the team created a “Soy‐DS Selector” table, a simple tool that lets users generate customized germplasm diverse sets based on location, preferred materials, or seed availability. This approach gives breeders and researchers an efficient way to preserve, explore, and use soybean diversity for soybean improvement.
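A minimal sketch of how a cluster table like the Soy-DS Selector might be used is given here. The actual table layout is not described in this summary, so everything below is an assumption: the column names (`accession`, `maturity_group`, `cluster`, `seed_available`), the accession IDs, and the rule of taking one accession per cluster are all hypothetical. Under that rule, the number of clusters fixes the size of the resulting set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One line of a hypothetical selector table."""
    accession: str
    maturity_group: str
    cluster: int
    seed_available: bool


def select_diverse_set(rows, maturity_groups, preferred=()):
    """Pick one accession per cluster (assumed rule), keeping only rows in
    the requested maturity groups with seed available, and favouring the
    user's preferred accessions when a cluster offers a choice."""
    picks = {}
    for row in rows:
        if row.maturity_group not in maturity_groups or not row.seed_available:
            continue
        current = picks.get(row.cluster)
        if current is None or (
            row.accession in preferred and current.accession not in preferred
        ):
            picks[row.cluster] = row
    return [picks[c].accession for c in sorted(picks)]


# Hypothetical table; the IDs are placeholders, not real accessions.
table = [
    Row("ACC-001", "III", 1, True),
    Row("ACC-002", "III", 1, True),
    Row("ACC-003", "III", 2, False),
    Row("ACC-004", "III", 2, True),
    Row("ACC-005", "IV", 3, True),
]

print(select_diverse_set(table, {"III"}))
print(select_diverse_set(table, {"III"}, preferred={"ACC-002"}))
print(select_diverse_set(table, {"III", "IV"}))
```

Because members of a cluster are treated as interchangeable for diversity purposes in this sketch, swapping in a preferred or available accession changes the membership of the set without changing which clusters it covers.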

## Linked entities

- **Species:** Glycine max (taxon 3847)

## Full-text entities

- **Diseases:** pests (MESH:D029021), drought (MESH:C536747), frogeye leaf spot (MESH:D008796)
- **Chemicals:** amino acid (MESH:D000596), fatty acid (MESH:D005227), oil (MESH:D009821), ozone (MESH:D010126), methionine (MESH:D008715), sugar (MESH:D000073893)
- **Species:** Glycine soja (wild soybean, species) [taxon 3848], Oryza sativa (Asian cultivated rice, species) [taxon 4530], Glycine subgen. Soja (subgenus) [taxon 1462606], Glycine max (soybean, species) [taxon 3847]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12968749/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12968749/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/PMC12968749/full.md

---
Source: https://tomesphere.com/paper/PMC12968749