# Applying traditional and machine learning-based GWAS approaches for marker-trait identification in wheat

**Authors:** Joel Joshua Milek, Sebastian Michel, Alexander Buchelt, Andreas Holzinger, Eva Maria Molin

PMC · DOI: 10.3389/fpls.2025.1734247 · 2026-01-28

## TL;DR

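This paper compares traditional GWAS tools (GAPIT, GCTA, GEMMA, sommer, TASSEL) with machine learning models (Elastic Net, XGBoost, Random Forest, TSLRF) for identifying marker-trait associations in a public CIMMYT winter wheat dataset. The ML models recover several associations found by the traditional tools and flag additional markers that may reflect non-linear or epistatic effects.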

## Contribution

The study demonstrates how machine learning complements traditional GWAS by capturing non-linear genetic effects in wheat.

## Key findings

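- Despite a shared reliance on mixed linear models, the traditional GWAS tools differed in runtime and showed modest variability in the number and overlap of detected marker-trait associations (MTAs).
- Machine learning models recovered several associations detected by traditional methods and identified novel markers that those methods did not detect.
- By extending beyond additive effects, ML approaches broaden the range of genetic signals that can be detected for complex wheat traits.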

## Abstract

Complex traits arise from polygenic and interactive genomic architectures that are difficult to resolve using traditional genome-wide association study (GWAS) approaches. Machine learning (ML) provides complementary methods capable of capturing non-linear effects, improving signal detection, and enhancing predictive accuracy of marker-trait associations (MTAs).
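To make the point about non-linear effects concrete, the following toy sketch shows a purely epistatic trait that a marker-by-marker additive test cannot see. Everything in it is an illustrative assumption and not taken from the paper: the two hypothetical loci, the 0/1 genotype coding, the balanced 20-line panel, and the noise-free phenotype.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Balanced toy panel (assumption): each two-locus genotype appears 5 times.
panel = [g for g in product((0, 1), repeat=2) for _ in range(5)]
marker_a = [a for a, _ in panel]
marker_b = [b for _, b in panel]

# Purely epistatic trait: elevated only when the two loci carry different alleles.
phenotype = [10.0 + 2.0 * (a != b) for a, b in panel]

# Explicit interaction feature that a non-linear model could effectively learn.
interaction = [float(a != b) for a, b in panel]

r_a = pearson(marker_a, phenotype)      # marginal signal of locus A
r_b = pearson(marker_b, phenotype)      # marginal signal of locus B
r_ab = pearson(interaction, phenotype)  # signal of the A-by-B interaction
print(f"marker A: {r_a:.2f}, marker B: {r_b:.2f}, A x B: {r_ab:.2f}")
```

In this constructed case each locus on its own is uncorrelated with the trait, while the interaction term explains it completely. Real data are rarely this clean, but the example shows why a model that can combine markers may pick up signals that single-marker additive tests miss.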

Using a publicly available winter wheat dataset (CIMMYT), we evaluated several widely used traditional GWAS tools, including GAPIT, GCTA, GEMMA, sommer, and TASSEL, with respect to computational efficiency, model performance, and the consistency of detected associations. In parallel, ML approaches, such as Elastic Net, Extreme Gradient Boosting (XGBoost), Random Forest, and the hybrid TSLRF model, were assessed based on feature importance metrics and functional annotation of selected markers.
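As a rough illustration of ranking markers by an ML feature importance metric, the sketch below fits an Elastic Net by coordinate descent and orders markers by absolute coefficient. It is a minimal standard-library version written for this summary, not the authors' pipeline or any library's implementation. The simulated genotypes, the single causal marker, the penalty settings (`lam`, `alpha`), and the use of absolute coefficients as the importance score are all assumptions.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||^2).
    Columns of X and y are centered internally, so no intercept is returned."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # residual with all coefficients at zero
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            xj = cols[j]
            z = sum(v * v for v in xj) / n
            if z == 0.0:
                continue  # monomorphic marker carries no information
            # Covariance of marker j with the residual that excludes its own fit.
            rho = sum(v * r for v, r in zip(xj, residual)) / n + z * coef[j]
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, xj)]
                coef[j] = new
    return coef


# Simulated data (assumption): 200 lines, 20 biallelic markers coded 0/1,
# with marker index 3 as the only causal locus.
rng = random.Random(0)
n_lines, n_markers, causal = 200, 20, 3
X = [[rng.randint(0, 1) for _ in range(n_markers)] for _ in range(n_lines)]
y = [2.0 * row[causal] + rng.gauss(0.0, 0.5) for row in X]

coef = elastic_net(X, y)
ranked = sorted(range(n_markers), key=lambda j: abs(coef[j]), reverse=True)
print("top markers by |coefficient|:", ranked[:3])
```

Tree-based models such as Random Forest and XGBoost expose different importance measures (for example impurity or gain based), so rankings from different model families are not directly comparable without care.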

Despite a shared reliance on mixed linear models, the traditional GWAS tools exhibited differences in runtime and showed modest but meaningful variability in the number and overlap of MTAs. ML models recovered several associations detected by traditional methods and additionally identified novel markers, potentially reflecting non-linear or epistatic effects.
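One simple way to quantify overlap between the MTA sets reported by different tools is the Jaccard index, sketched below. The marker IDs and the per-tool sets are invented for illustration, and the abstract does not state which overlap measure the authors used, so the choice of Jaccard is an assumption.

```python
from itertools import combinations

# Hypothetical MTA lists per tool; marker IDs are made up for illustration.
mtas = {
    "GAPIT": {"m01", "m02", "m03", "m04"},
    "GEMMA": {"m01", "m02", "m03"},
    "TASSEL": {"m02", "m03", "m05"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Pairwise overlap between tools, keyed by alphabetically ordered tool pairs.
pairwise = {
    (s, t): jaccard(mtas[s], mtas[t])
    for s, t in combinations(sorted(mtas), 2)
}

# Markers reported by every tool.
consensus = set.intersection(*mtas.values())

for pair, score in pairwise.items():
    print(pair, round(score, 2))
print("consensus markers:", sorted(consensus))
```

The same comparison can be applied between a traditional tool's MTA set and the top-ranked markers from an ML model, with markers outside the intersection being the candidates for novel associations.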

Our findings demonstrate that ML can effectively complement traditional GWAS approaches for marker-trait identification in wheat. By extending beyond additive effects, ML broadens the scope of detectable genetic signals, providing a practical way to analyze complex traits and support informed marker-assisted breeding strategies.

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12891156/full.md

---
Source: https://tomesphere.com/paper/PMC12891156