# Near-infrared prediction of tannin content in walnut kernels using wavelet transform combined with interpretable machine learning models

**Authors:** Qiuhao Xia, Langqin Luo, Yerhazi Yerzati, Mian Muhammad Ahmed, Yonghao Chen, Shiwei Wang, Jiangnan Qin, Liping Chen, Qiang Jin, Zhongzhong Guo, Rui Zhang

PMC · DOI: 10.3389/fpls.2026.1746869 · Frontiers in Plant Science · 2026-02-06

## TL;DR

This study combines near-infrared (NIR) spectroscopy, wavelet-transformed spectra, and a random forest model to predict tannin content in walnut kernels rapidly and non-destructively, in support of walnut quality assessment.

## Contribution

Combines continuous wavelet transform (CWT) preprocessing of NIR spectra with a SHAP-interpreted random forest model to improve tannin prediction in walnut kernels.

## Key findings

- NIR reflectance of walnut kernels increases with tannin content across different orchard management modes.
- Combining first-order differential transformation and wavelet transform significantly improves model prediction performance.
- The optimal model reached R² = 0.831, RMSE = 1.620, and RPD = 2.459 on the validation set.

## Abstract

Tannin content is a key factor influencing the taste of walnuts and serves as an important index for evaluating walnut quality. Rapid and accurate detection of tannin levels in walnut kernels is therefore significant for quality assessment and management. This study aims to develop an efficient method for predicting tannin content in walnut kernels using near-infrared (NIR) spectroscopy combined with machine learning techniques.

A total of 180 ‘Wen 185’ walnut kernel samples were analyzed. NIR reflectance spectra of the samples were measured over the range 4000–10000 cm⁻¹. The spectra were processed with mathematical transformations and the continuous wavelet transform (CWT), both separately and in combination. Pearson correlation analysis was then applied to extract characteristic bands related to tannin content, and a random forest (RF) model built on these bands was used to predict tannin content quantitatively. Finally, the SHapley Additive exPlanations (SHAP) algorithm was used to interpret and visualize the machine learning model.
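The preprocessing chain described above (a mathematical transformation, a wavelet transform, then Pearson screening of bands) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the Ricker (Mexican-hat) wavelet, the scale of 4, the synthetic spectra, the tannin value range, and every function name are assumptions introduced here, and reading lg’(1/R) as the first derivative of log10(1/R) is an interpretation of the feature name in the abstract.

```python
import numpy as np


def ricker(points: int, scale: float) -> np.ndarray:
    """Mexican-hat (Ricker) wavelet sampled at `points` positions (assumed wavelet)."""
    t = np.arange(points) - (points - 1) / 2.0
    norm = 2.0 / (np.sqrt(3.0 * scale) * np.pi ** 0.25)
    return norm * (1.0 - (t / scale) ** 2) * np.exp(-(t ** 2) / (2.0 * scale ** 2))


def log_inverse_first_derivative(reflectance: np.ndarray, wavenumbers: np.ndarray) -> np.ndarray:
    """First derivative of log10(1/R) along the wavenumber axis (rows are samples)."""
    absorbance = np.log10(1.0 / reflectance)
    return np.gradient(absorbance, wavenumbers, axis=1)


def cwt_at_scale(spectra: np.ndarray, scale: float) -> np.ndarray:
    """Convolve each spectrum with a Ricker wavelet at a single scale."""
    width = min(int(10 * scale) + 1, spectra.shape[1])
    kernel = ricker(width, scale)
    return np.array([np.convolve(row, kernel, mode="same") for row in spectra])


def pearson_per_band(features: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Pearson correlation between each band (column) and the target."""
    fc = features - features.mean(axis=0)
    tc = target - target.mean()
    denom = np.sqrt((fc ** 2).sum(axis=0) * (tc ** 2).sum())
    return (fc * tc[:, None]).sum(axis=0) / denom


def make_synthetic_spectra(n_samples: int = 180, n_bands: int = 300, seed: int = 0):
    """Hypothetical data: one tannin-sensitive band near 7000 cm^-1 plus noise."""
    rng = np.random.default_rng(seed)
    wavenumbers = np.linspace(4000.0, 10000.0, n_bands)
    tannin = rng.uniform(5.0, 25.0, n_samples)  # arbitrary illustrative values
    band = np.exp(-((wavenumbers - 7000.0) / 300.0) ** 2)
    noise = rng.normal(0.0, 0.0005, (n_samples, n_bands))
    reflectance = 0.4 + 0.004 * tannin[:, None] * band + noise
    return wavenumbers, reflectance, tannin


wavenumbers, reflectance, tannin = make_synthetic_spectra()
features = cwt_at_scale(log_inverse_first_derivative(reflectance, wavenumbers), scale=4.0)
r = pearson_per_band(features, tannin)
best = int(np.argmax(np.abs(r)))
print(f"strongest band: {wavenumbers[best]:.0f} cm^-1, r = {r[best]:+.3f}")
```

Bands ranked by the absolute correlation could then be passed to a regressor such as a random forest. Screening on the absolute value keeps bands that correlate negatively with tannin content as well as positively.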

The results indicated that, within the spectral range of 4000–10000 cm⁻¹, the NIR reflectance of walnut kernels increased with tannin content under different orchard management modes. First-order differential transformation, CWT, and their combination all significantly enhanced the correlation between spectral data and tannin content, and the combination notably improved the model's prediction performance. The optimal prediction model was obtained with the feature lg’(1/R)_CWT_28, with training-set metrics of R² = 0.880, RMSE = 1.188, and RPD = 2.904, and validation-set metrics of R² = 0.831, RMSE = 1.620, and RPD = 2.459.
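For readers unfamiliar with the three reported metrics, the sketch below computes them from reference and predicted values. The definitions are the common ones and are assumptions here: R² as 1 − SSE/SST, RMSE as the root of the mean squared error, and RPD as the sample standard deviation of the reference values divided by RMSE. This summary does not state which variants the authors used, and the numbers in the example are made up.

```python
import math
from statistics import mean, stdev


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and RPD for paired reference/predicted values."""
    n = len(y_true)
    y_bar = mean(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    sst = sum((t - y_bar) ** 2 for t in y_true)              # total sum of squares
    rmse = math.sqrt(sse / n)
    return {
        "r2": 1.0 - sse / sst,
        "rmse": rmse,
        "rpd": stdev(y_true) / rmse,  # assumes sample SD (n - 1 denominator)
    }


# Made-up reference and predicted tannin values, for illustration only.
metrics = regression_metrics([10, 12, 14, 16, 18], [11, 12, 13, 16, 19])
print({name: round(value, 3) for name, value in metrics.items()})
```

Under this definition, RPD expresses how large the spread of the reference values is relative to the prediction error, so a higher RPD means the model resolves differences between samples more clearly.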

The study demonstrates that combining mathematical transformations with the wavelet transform can effectively improve the accuracy of models that predict tannin content in walnut kernels. The RF model based on the processed spectral data performed strongly, indicating its potential for rapid, non-destructive tannin quantification. The use of the SHAP algorithm further enhances model interpretability. These findings provide a valuable reference for the accurate prediction of tannin content in walnut kernels and may support quality control in walnut production and processing.
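To make the interpretability step concrete, the sketch below computes exact Shapley attributions by brute-force enumeration for a toy predictor. It illustrates the quantity that SHAP approximates and is not the authors' code or the SHAP library: the predictor, its coefficients, the feature values, and the background rows are all hypothetical, and practical implementations for tree ensembles rely on much faster algorithms than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley attributions for one sample `x`.

    The value of a feature subset S is the mean prediction when features in S
    take their values from `x` and the remaining ones from each background row.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_predict(features):
    """Hypothetical predictor over three spectral features, with one interaction."""
    return 10.0 + 4.0 * features[0] - 2.0 * features[1] + features[0] * features[2]


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.5, 1.5]]
sample = [1.5, 0.2, 2.0]
phi = shapley_values(toy_predict, sample, background)
baseline = sum(toy_predict(row) for row in background) / len(background)
print("attributions:", [round(p, 3) for p in phi])
print("sum check:", round(sum(phi), 6), "vs", round(toy_predict(sample) - baseline, 6))
```

The attributions sum to the difference between the sample's prediction and the mean background prediction, which is the additivity property that lets per-band contributions be read directly from a SHAP plot.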

## Full-text entities

- **Genes:** SHROOM4 (shroom family member 4) [NCBI Gene 57477] {aka MRXSSDS, SHAP, shrm4}
- **Diseases:** drought (MESH:C536747)
- **Chemicals:** sodium tungstate (MESH:C025399), gallic acid (MESH:D005707), sodium carbonate (MESH:C005686), sugar (MESH:D000073893), water (MESH:D014867), phosphomolybdic acid (MESH:C003125), ethanol (MESH:D000431), Tannin (MESH:D013634), EDTA (MESH:D004492), nitrogen (MESH:D009584), sodium molybdate (MESH:C024687)
- **Species:** Oryza sativa (Asian cultivated rice, species) [taxon 4530], Malus domestica (apple, species) [taxon 3750], Sorghum bicolor (broomcorn, species) [taxon 4558], Juglans regia (English walnut, species) [taxon 51240], Juglans (walnuts, genus) [taxon 16718]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12920457/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12920457/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12920457/full.md

---
Source: https://tomesphere.com/paper/PMC12920457