# Bayesian variable selection for genome-wide association study of grain traits in rice

**Authors:** Rupam Basu, Sabyasachi Mukhopadhyay, Kaustubh Adhikari, Aimin Zhang

PMC · DOI: 10.1371/journal.pone.0344021 · PLOS One · 2026-03-17

## TL;DR

This paper compares frequentist and Bayesian regression methods for predicting rice grain traits from genotype data. It finds that a Bayesian spike-and-slab prior model often predicts better than the classical methods while also selecting informative markers.

## Contribution

The study evaluates a Bayesian spike-and-slab prior model for rice GWAS against OLS, LASSO, Ridge, Bayesian LASSO, and BSLMM, and reports that it often outperforms the classical methods in cross-validated prediction.
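For orientation, a spike-and-slab prior on marker effects is commonly written in the generic form below. This is a textbook sketch, not necessarily the parameterization used in the paper: the symbols $\beta_j$ (effect of marker $j$), $\gamma_j$ (inclusion indicator), $\pi$ (prior inclusion probability), and $\tau^2$ (slab variance) are assumptions introduced here, and the paper's hyperpriors are not given in this summary.

$$
y_i = \mu + \sum_{j=1}^{p} x_{ij}\,\beta_j + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
$$

$$
\beta_j \mid \gamma_j \sim (1 - \gamma_j)\,\delta_0 + \gamma_j\,\mathcal{N}(0, \tau^2), \qquad \gamma_j \sim \mathrm{Bernoulli}(\pi).
$$

Here $\delta_0$ is a point mass at zero (the "spike"), so a marker with $\gamma_j = 0$ is excluded from the model. The posterior probability $P(\gamma_j = 1 \mid y)$ then serves as a variable-selection score for marker $j$.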

## Key findings

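- The Bayesian spike-and-slab prior model often outperformed the classical (frequentist) methods, yielding superior prediction and effective variable selection.
- The models differed in predictive behavior, stability, and inferential characteristics; the study emphasizes these differences rather than biological interpretation of individual loci.
- Phenotypic traits were transformed where necessary to approximate normality, and predictive performance was evaluated by cross-validation using mean squared error and predictive correlation.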
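
The two cross-validation metrics named above, mean squared error and predictive correlation, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the genotypes and phenotype are simulated (not the paper's rice data), the helper names `ridge_fit` and `cv_scores` are hypothetical, and only OLS and ridge are shown because they have closed-form solutions. LASSO and the Bayesian models from the paper are not implemented.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate. lam == 0 gives the least-squares
    solution (minimum-norm when there are more markers than samples)."""
    if lam == 0:
        return np.linalg.lstsq(X, y, rcond=None)[0]
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_scores(X, y, lam, k=5, seed=0):
    """K-fold cross-validation: returns (mean MSE, mean predictive
    correlation) across the held-out folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    mses, cors = [], []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        # Center with training-fold statistics only, to avoid leakage.
        x_mean = X[train].mean(axis=0)
        y_mean = y[train].mean()
        beta = ridge_fit(X[train] - x_mean, y[train] - y_mean, lam)
        pred = (X[test] - x_mean) @ beta + y_mean
        mses.append(np.mean((y[test] - pred) ** 2))
        # Predictive correlation: Pearson r between observed and predicted.
        cors.append(np.corrcoef(y[test], pred)[0, 1])
    return float(np.mean(mses)), float(np.mean(cors))

# Simulated example (hypothetical sizes): 150 lines, 300 SNPs coded 0/1/2,
# with 5 causal markers. This is NOT the data set used in the paper.
rng = np.random.default_rng(42)
n, p = 150, 300
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + rng.normal(size=n)

for name, lam in [("OLS (min-norm)", 0.0), ("Ridge", 50.0)]:
    mse, cor = cv_scores(X, y, lam)
    print(f"{name}: MSE={mse:.3f}, predictive correlation={cor:.3f}")
```

Lower MSE and higher predictive correlation on held-out folds indicate better prediction. The penalty value `50.0` is arbitrary here; in practice it would be tuned, for example by an inner cross-validation loop.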

## Abstract

Rice (Oryza sativa) is a staple food crop for more than half of the world's population. Besides being a nutritious, gluten-free grain, it has high economic value and supports the livelihoods of millions of farmers. Developing new rice varieties with improved yield, stress tolerance, and grain quality therefore remains a central goal of agricultural research. Genome-wide association studies (GWAS) provide a powerful framework for linking genetic variation to complex phenotypic traits, but the high dimensionality of genomic data presents significant challenges for model selection and prediction. Using rice genotype and phenotype data, we compared the performance of several frequentist and Bayesian modeling approaches: multiple linear regression (OLS: Ordinary Least Squares), LASSO (Least Absolute Shrinkage and Selection Operator), Ridge, Bayesian LASSO, Bayesian Sparse Linear Mixed Model (BSLMM), and a Bayesian spike-and-slab prior model. Phenotypic traits were transformed where necessary to approximate normality, and predictive performance was evaluated through cross-validation using mean squared error and predictive correlation. The spike-and-slab prior model often outperformed the classical methods, yielding superior prediction and effective variable selection. Our findings demonstrate the value of Bayesian model selection frameworks for plant GWAS and trait prediction, and highlight the effectiveness of Bayesian methods in identifying informative markers in rice. Such approaches hold promise for accelerating genetic improvement and supporting marker-assisted selection in crop breeding programs. Rather than emphasizing biological interpretation of individual loci, our results highlight differences in predictive behavior, stability, and inferential characteristics across models.

## Linked entities

- **Species:** Oryza sativa (taxon 4530)

## Full-text entities

- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12994784/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12994784/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12994784/full.md

---
Source: https://tomesphere.com/paper/PMC12994784