# A topological map of the genetic components of grapevine—Admixture meets SOMmelier machine learning

**Authors:** Anush Baloyan, Tomas Konecny, Emma Hovhannisyan, Nate Zadirako, Maria Nikoghosyan, Hans Binder

PMC12948125 · DOI: 10.1371/journal.pcbi.1013882 · PLOS Computational Biology · 2026-02-20

## TL;DR

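This paper introduces SOMmelier, a tool based on Self-Organizing Maps (SOMs) that groups co-mutated SNPs into modules and arranges them in a two-dimensional genetic landscape. Applied to European grapevine, it recovers the genetic components found by Admixture and shows how they relate to the crop's geography and cultivation history.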

## Contribution

The paper combines SOM-based machine learning with genetic admixture analysis to build a topology-aware genetic landscape: a two-dimensional map of SNP modules in which spatial proximity reflects mutual similarity.

## Key findings

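- SOMmelier recovers the genetic components identified by Admixture solely through statistical clustering.
- The genetic landscape, built from up to six components, mirrors the geographic expanse of the classical Mediterranean world, from Western Asia through the Caucasus to Western Europe, and reflects the history of grapevine domestication and diffusion.
- SOMmelier complements and extends traditional admixture analysis and is applicable to other species and systems in population genetics.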
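
For orientation, the admixture components mentioned above are conventionally defined through the following likelihood model. This is the standard formulation used by ADMIXTURE-style tools, stated here from general knowledge rather than taken from the paper; the symbols $g_{ij}$, $q_{ik}$ and $f_{kj}$ are notation assumed for this sketch.

$$
g_{ij} \sim \mathrm{Binomial}\!\left(2,\ \sum_{k=1}^{K} q_{ik}\, f_{kj}\right),
\qquad q_{ik} \ge 0,\quad \sum_{k=1}^{K} q_{ik} = 1
$$

Here $g_{ij} \in \{0, 1, 2\}$ is the alternative-allele count of accession $i$ at SNP $j$, $q_{ik}$ is the fraction of accession $i$ assigned to component $k$, and $f_{kj}$ is the allele frequency of SNP $j$ in component $k$. The study considers up to six components, so $K \le 6$.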

## Abstract

Inferring the genetic structure at the subpopulation level is crucial for understanding the demographic histories that shape genetic diversity. Among the most widely used approaches are methods based on admixture and structure modeling—named after the respective software tools—which have become standard due to their intuitive, interpretable outputs. In this study, we address a key methodological question: how does the traditional admixture-based decomposition of genetic components in multilocus population data relate to clustering approaches that leverage machine learning, specifically Self-Organizing Maps (SOMs)? We implemented this approach through our custom SOM-based tool, SOMmelier, which enables the portrayal of genetic structure by identifying modules of co-mutated SNPs and arranging them in a topology-aware genetic landscape. Topology-awareness refers to the organization of genetic modules in a two-dimensional map, where their spatial proximity reflects mutual similarity. We applied Admixture and SOMmelier to investigate the population genetics of European grapevine. Based on prior literature, we considered up to six genetic components, which formed a genetic landscape that closely mirrors the geographic expanse of the classical Mediterranean world—from Western Asia through the Caucasus to Western Europe. The resulting topology reflects the dynamic spatial and temporal nature of grapevine domestication and diffusion. We demonstrate that SOMmelier can recover the genetic components identified by Admixture solely through statistical clustering. By integrating the topological structure of SNP co-variation, it offers perspectives on population structure, evolutionary history, and trait associations in grapevine—and has applicability to other species and systems in population genetics.

## Author summary

Populations are shaped by both evolutionary processes and human activities such as breeding, which is especially evident in cultivated animals and plants. The genetic variation within these populations is encoded in their genomes, and can often be described as a combination of distinct genetic “admixture” components using standard computational approaches. In this study, we ask: How does this admixture-based view of population structure compare to the representation provided by machine-learning–based Self-Organizing Maps (SOMs)? SOMs offer an intuitive way to explore complex molecular data and reveal relationships that might be missed by conventional methods. Using cultivated grapevine as a model—an economically important, globally distributed crop with a long history of domestication—we show that our SOMmelier approach not only recapitulates known genomic components but also constructs a topology-aware genetic landscape. This landscape reflects the geographic distribution of grapevine accessions across Europe and West Asia, and preserves genetic footprints of cultivation history spanning the past 11,000 years. Importantly, SOMmelier both complements and extends genetic admixture analysis, highlighting its potential for broad application in population genetics beyond grapevine.
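
To make "topology-aware" concrete, here is a minimal sketch of a Self-Organizing Map trained on SNP profiles, written with NumPy. It is not the SOMmelier implementation: the toy genotype generator, the function names (`train_som`, `best_matching_unit`, `toy_genotypes`, `mean_grid_distance`), the grid size and the learning schedule are all assumptions made for illustration. Each SNP is treated as one item whose profile is its allele count across accessions, so SNPs that vary together should land on nearby map units.

```python
import numpy as np


def train_som(data, grid=(6, 6), epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every unit, used by the neighbourhood function.
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
            sigma = sigma0 * (1.0 - frac) + 0.5      # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            # Gaussian neighbourhood measured on the grid, not in data space:
            # this coupling is what makes the final map topology-preserving.
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def best_matching_unit(weights, x):
    """Grid position (row, col) of the unit whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)


def toy_genotypes(n_modules=3, snps_per_module=40, accessions_per_group=10, seed=1):
    """Hypothetical data: rows are SNPs, columns are accessions, values are 0/1/2."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for m in range(n_modules):
        # The alternative allele is frequent only in the matching accession group.
        freqs = np.where(np.arange(n_modules) == m, 0.9, 0.1)
        p = np.repeat(freqs, accessions_per_group)   # one frequency per accession
        blocks.append(rng.binomial(2, p, size=(snps_per_module, p.size)))
        labels += [m] * snps_per_module
    return np.vstack(blocks).astype(float), np.array(labels)


def mean_grid_distance(positions, labels, same):
    """Mean map distance between SNP pairs from the same (or different) module."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    mask = (labels[:, None] == labels[None, :]) == same
    np.fill_diagonal(mask, False)                    # ignore self-pairs
    return d[mask].mean()


snps, module = toy_genotypes()
som = train_som(snps)
positions = np.array([best_matching_unit(som, x) for x in snps])

print("within-module map distance: ", mean_grid_distance(positions, module, True))
print("between-module map distance:", mean_grid_distance(positions, module, False))
```

On this toy input, SNPs from the same simulated module should sit closer together on the map than SNPs from different modules, which is the property the abstract calls topology-awareness. Real genotype data would need preprocessing and parameter choices that this sketch does not cover.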

## Linked entities

- **Species:** Vitis vinifera (taxon 29760)

## Full-text entities

- **Diseases:** cancer (MESH:D009369)
- **Chemicals:** nitrogen (MESH:D009584), amide (MESH:D000577), K (MESH:D011188), amino acids (MESH:D000596), FAD (MESH:D005182), ATP (MESH:D000255), calcium (MESH:D002118)
- **Species:** Canis lupus familiaris (dog, subspecies) [taxon 9615], Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049], Vitis vinifera (wine grape, species) [taxon 29760], Solanum lycopersicum (tomato, species) [taxon 4081], Homo sapiens (human, species) [taxon 9606], Felis catus (cat, species) [taxon 9685], Oryza sativa (Asian cultivated rice, species) [taxon 4530], Equus caballus (domestic horse, species) [taxon 9796]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12948125/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12948125/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/PMC12948125/full.md

---
Source: https://tomesphere.com/paper/PMC12948125