# Generalizability and transferability of machine learning models using hyperspectral reflectance data for maize traits

**Authors:** Rudan Xu, John Ferguson, Matthieu Breil-Aubert, Johannes Kromdijk, Zoran Nikoloski

PMC · DOI: 10.1038/s41598-026-36819-1 · 2026-01-21

## TL;DR

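This paper benchmarks PLSR and SVR models that predict 25 maize traits from hyperspectral reflectance data. Structural and biochemical traits generalize and transfer best, while physiological traits derived from gas exchange and fluorescence kinetics transfer markedly worse.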

## Contribution

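The study provides a systematic benchmark of ML model performance, generalizability, and transferability using hyperspectral reflectance data paired with 25 anatomical, gas exchange, and chlorophyll fluorescence traits, measured on 320 maize recombinant inbred lines over three seasons.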

## Key findings

- Structural and biochemical traits showed the best generalizability and transferability.
- Physiological traits, particularly those derived from gas exchange and fluorescence kinetics, showed markedly reduced transferability.
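- Optimal trait-specific predictions depended on the combination of model (PLSR or SVR) and data aggregation level.
- Within a nested cross-validation framework, single cross-validation with MSE as the metric performed comparably to repeated cross-validation or PRESS-based calibration.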
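
To make the idea of data aggregation concrete, the sketch below averages replicate spectra at two levels. It is a minimal illustration only: the summary does not state which aggregation levels the authors used, so the two levels shown (genotype-by-season and genotype), the record layout, the `aggregate` helper, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (genotype, season, reflectance at three wavebands).
# The values are invented for illustration and are not data from the paper.
records = [
    ("RIL_001", "S1", [0.10, 0.40, 0.50]),
    ("RIL_001", "S1", [0.12, 0.44, 0.54]),
    ("RIL_001", "S2", [0.08, 0.36, 0.46]),
    ("RIL_002", "S1", [0.20, 0.50, 0.60]),
    ("RIL_002", "S2", [0.22, 0.54, 0.62]),
]


def aggregate(records, key):
    """Average spectra band by band within each group defined by `key`."""
    groups = defaultdict(list)
    for genotype, season, spectrum in records:
        groups[key(genotype, season)].append(spectrum)
    # zip(*spectra) regroups the values by waveband before averaging.
    return {
        group: [mean(band) for band in zip(*spectra)]
        for group, spectra in groups.items()
    }


# Assumed level 1: one mean spectrum per genotype within each season.
by_genotype_season = aggregate(records, key=lambda g, s: (g, s))
# Assumed level 2: one mean spectrum per genotype across all seasons.
by_genotype = aggregate(records, key=lambda g, s: g)

print(len(by_genotype_season), len(by_genotype))
```

Coarser aggregation reduces measurement noise but also shrinks the number of training samples, which is one plausible reason the best aggregation level could differ between traits and models.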

## Abstract

Hyperspectral reflectance provides rapid, non-destructive phenotyping of plant leaves. These data have been used to develop machine learning models for predicting diverse plant traits, yet key challenges remain. We collected hyperspectral reflectance data together with 25 anatomical, gas exchange, and chlorophyll fluorescence traits from 320 recombinant inbred lines grown over three seasons. Using these data, we systematically (1) compare the performance of partial least squares regression (PLSR) and support vector regression (SVR) across a wide range of traits, including slow fluorescence kinetics, (2) assess model generalizability and transferability, and (3) investigate how different aggregation strategies affect predictive accuracy. Within a nested cross-validation framework, single cross-validation with mean squared error (MSE) as the metric performed comparably to repeated cross-validation or calibration based on the predicted residual error sum of squares (PRESS). Optimal performance of trait-specific predictions depended on the combination of model and data aggregation level. Structural and biochemical traits showed the best generalizability and transferability, whereas physiological traits, particularly those derived from gas exchange and fluorescence kinetics, exhibited markedly reduced transferability. Together, these results provide a rigorous benchmark for evaluating machine learning models for trait prediction from hyperspectral reflectance data, and highlight both the opportunities and limitations for achieving robust generalization across diverse environments and genotypes.

The online version contains supplementary material available at 10.1038/s41598-026-36819-1.

## Full-text entities

- **Chemicals:** chlorophyll (MESH:D002734)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12894961/full.md

---
Source: https://tomesphere.com/paper/PMC12894961