# Benchmarking Feature Selection Methods and Prediction Models for Flowering Time Prediction in Maize

**Authors:** Yan Du, Nianhua Jia, Yueli Wang, Ronglan Li, Ying Lu, Tobias Würschum, Xintian Zhu, Wenxin Liu

PMC · DOI: 10.3390/ijms27041635 · International Journal of Molecular Sciences · 2026-02-07

## TL;DR

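This study benchmarks 42 combinations of seven feature selection methods and six prediction models for predicting maize flowering time from SNP and transcriptomic data, and uses SHAP-based interpretation of random forests to identify key regulatory genes such as ZmMADS69 and ZmRap2.7.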

## Contribution

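The paper introduces a benchmarking framework that jointly evaluates seven feature selection (FS) methods and six genomic prediction (GP) models, 42 FS–GP combinations in total, on integrated SNP and transcriptomic data in maize, with SHAP-based interpretation for candidate gene discovery.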

## Key findings

- The study identified known regulators like ZmMADS69 and ZmRap2.7 using SHAP-based interpretation in random forests.
- Combining SNP and transcriptomic data improved prediction accuracy and gene discovery for flowering time.
- The framework revealed additional candidate genes in the maize flowering regulatory network.
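
SHAP attributes a model's output to individual features using Shapley values. The following is a minimal sketch of an exact Shapley computation, not the paper's pipeline. It assumes a hypothetical toy scoring function over three made-up features (`snp_a`, `expr_b`, `expr_c`), where a feature is either present or absent. SHAP on a real random forest instead averages over background data and uses a tree-specific algorithm.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_model(present):
    """Hypothetical score: two additive effects plus one interaction."""
    score = 0.0
    if "snp_a" in present:
        score += 2.0
    if "expr_b" in present:
        score += 1.0
    if "snp_a" in present and "expr_b" in present:
        score += 1.0  # interaction, split equally between the two features
    return score


features = ["snp_a", "expr_b", "expr_c"]
phi = shapley_values(toy_model, features)
for name in features:
    print(name, round(phi[name], 3))
```

The attributions sum to the difference between the score with all features present and the score with none. That additivity property is what makes per-feature contributions comparable when ranking candidate genes.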

## Abstract

Flowering time is a fundamental trait that determines crop adaptation and yield stability. To accurately predict flowering time and identify key regulatory factors, it is necessary to extract biologically meaningful signals from high-dimensional and multi-omics datasets. Although machine learning has been increasingly applied in plant genomics, there is still limited research on how feature selection (FS) methods and genomic prediction (GP) models affect the prediction of flowering time and gene discovery, particularly regarding the combination of different FS and GP approaches and the interpretability of prediction models. To address this gap, we conducted a large-scale benchmarking study that jointly evaluated seven feature selection methods and six prediction models, resulting in 42 FS–GP combinations. By integrating SNP and transcriptomic data, we assessed predictive performance and further interpreted model outputs using SHAP (SHapley Additive exPlanations) within a random forest (RF) framework to quantify feature contributions. This strategy successfully identified known flowering time regulators in maize, including ZmMADS69 and ZmRap2.7, and revealed additional candidate genes potentially involved in the flowering regulatory network. Overall, this study offers valuable insights into the genetic regulation of flowering time in maize and provides an effective framework for discovering candidate genes from multi-omics data for crop improvement.
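
The benchmarking design described in the abstract is a full cross of FS methods and GP models. The sketch below shows how such a grid could be enumerated. The abstract gives only the counts, so the method labels are placeholders, and `evaluate` is a hypothetical stub standing in for whatever cross-validated accuracy measure the paper uses.

```python
from itertools import product

# Placeholder labels: the abstract states seven FS methods and six GP models
# but does not name them, so these identifiers are assumptions.
fs_methods = [f"FS_{i}" for i in range(1, 8)]
gp_models = [f"GP_{j}" for j in range(1, 7)]


def evaluate(fs_name, gp_name):
    """Hypothetical stub: select features with fs_name, fit gp_name,
    and return a predictive-performance score. Not implemented here."""
    return None


# Every FS method is paired with every GP model.
fs_gp_grid = list(product(fs_methods, gp_models))
results = {(fs, gp): evaluate(fs, gp) for fs, gp in fs_gp_grid}

print(len(fs_gp_grid))  # 7 * 6 = 42
```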

## Full-text entities

- **Genes:** GRMZM2G420684 [NCBI Gene 103635879], GRMZM2G085438 [NCBI Gene 100382242], LOC100127519 (ZCN8 protein) [NCBI Gene 100127519] {aka GRMZM2G179264, ZCN8}, LOC103635944 (APETALA2-like protein 1) [NCBI Gene 103635944] {aka GRMZM2G700665, Rap2.7}, LOC100272251 (uncharacterized LOC100272251) [NCBI Gene 100272251] {aka GRMZM2G171650, MADS69, m22}
- **Diseases:** injury to (MESH:D014947), cancer (MESH:D009369)
- **Chemicals:** gibberellin (MESH:D005875), JA (-), lipid (MESH:D008055), GA (MESH:D005708)
- **Species:** Arabidopsis thaliana (mouse-ear cress, species) [taxon 3702], Zea mays (maize, species) [taxon 4577], Homo sapiens (human, species) [taxon 9606], Oryza sativa (Asian cultivated rice, species) [taxon 4530]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12941171/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12941171/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/PMC12941171/full.md

---
Source: https://tomesphere.com/paper/PMC12941171