# Quantitative Analysis of Polyphenols in Lonicera caerulea Based on Mid-Infrared Spectroscopy and Hybrid Variable Selection

**Authors:** Haiwei Wu, Xuexin Li, Jianwei Liu, Zhihao Wang, Yuchun Liu

PMC · DOI: 10.3390/molecules31040750 · Molecules · 2026-02-23

## TL;DR

This study combines mid-infrared (MIR) spectroscopy with a hybrid variable selection strategy (PLS VIP ≥ 1.0 intersected with the top 30% of random forest importances) to predict polyphenol content in blue honeysuckle rapidly and non-destructively.

## Contribution

A novel hybrid variable selection strategy is proposed for high-dimensional, small-sample MIR spectroscopy data to improve polyphenol quantification.

## Key findings

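- The hybrid variable selection method reduced the 7468 spectral variables to 984 key wavelengths, an 86.8% dimensionality reduction.
- On the independent test set, the optimized XGBoost model reached R² = 0.92, RMSE = 0.098, and RPD = 3.47, outperforming the classical CARS method (R² = 0.78, RPD = 2.14).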
- The method enables rapid, non-destructive polyphenol analysis in Lonicera caerulea.
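
The figures of merit quoted above can be reproduced from predicted and reference values with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes RPD is defined as the sample standard deviation of the reference values divided by RMSE, which is a common convention but is not spelled out in this summary.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², RMSE and RPD for paired reference and predicted values.

    Assumption: RPD = sample SD of the reference values / RMSE.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    sst = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    rmse = math.sqrt(sse / n)
    sd = math.sqrt(sst / (n - 1))
    return {"r2": 1.0 - sse / sst, "rmse": rmse, "rpd": sd / rmse}


def dimensionality_reduction(n_total, n_selected):
    """Fraction of variables removed by a selection step."""
    return 1.0 - n_selected / n_total


# Variable counts reported in the abstract: 7468 spectral variables, 984 kept.
reduction_pct = round(100 * dimensionality_reduction(7468, 984), 1)
```

With the counts from the abstract, `reduction_pct` evaluates to 86.8, matching the reported dimensionality reduction.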

## Abstract

Lonicera caerulea L. (blue honeysuckle) is rich in antioxidant polyphenols, and rapid and accurate determination of its polyphenol content is of great significance for functional food quality control. This study proposed a hybrid variable selection strategy designed for high-dimensional small-sample scenarios and developed a quantitative prediction model for polyphenol content based on mid-infrared (MIR) spectroscopy. A total of 191 Lonicera caerulea samples were collected from Northeast China, and 7468-dimensional spectral data were acquired using a Fourier transform infrared spectrometer. Polyphenol reference values were determined by the Folin–Ciocalteu method. Samples were divided into calibration (n = 152) and prediction (n = 39) sets using the SPXY algorithm. Among the 10 preprocessing methods evaluated, MSC combined with Savitzky–Golay first derivative achieved the best performance and was therefore used for subsequent modeling. The proposed hybrid variable selection method (VIP1.0∩RFR30%) intersected PLS variable importance in projection (VIP ≥ 1.0) with the top 30% important variables from random forest regression, selecting 984 key wavelengths and achieving 86.8% dimensionality reduction. A three-stage hyperparameter tuning strategy was implemented across four models (PLS, RFR, SVR, and XGBoost) to validate feature stability and control overfitting. The optimized XGBoost model achieved excellent performance on the independent test set (R² = 0.92, RMSE = 0.098, RPD = 3.47). Compared with the classical CARS method (R² = 0.78, RPD = 2.14), R² improved by 16.3% and RPD improved by 55.2%. The results demonstrate that the proposed hybrid variable selection strategy can effectively address the challenges of high-dimensional MIR spectral data in small-sample modeling, providing a reliable tool for rapid and non-destructive quantitative analysis of polyphenols in Lonicera caerulea.
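
The intersection step described in the abstract (PLS VIP ≥ 1.0 combined with the top 30% of random forest importances) can be sketched as a small set operation. This is a minimal illustration, not the authors' implementation: `hybrid_select` and its parameters are hypothetical names, the scores are assumed to have been computed beforehand from fitted PLS and random forest models, and the way the 30% cutoff is rounded is an assumption.

```python
import math


def hybrid_select(vip_scores, rf_importances, vip_threshold=1.0, top_fraction=0.30):
    """Return sorted indices of variables passing both filters.

    vip_scores      -- one PLS VIP score per variable (precomputed)
    rf_importances  -- one random forest importance per variable (precomputed)
    Assumption: the top-fraction count is rounded down, with at least one kept.
    """
    if len(vip_scores) != len(rf_importances):
        raise ValueError("both score vectors must cover the same variables")
    n_vars = len(vip_scores)

    # Filter 1: variables whose VIP score meets the threshold.
    vip_keep = {i for i, v in enumerate(vip_scores) if v >= vip_threshold}

    # Filter 2: the top fraction of variables ranked by random forest importance.
    n_keep = max(1, math.floor(n_vars * top_fraction + 1e-9))
    ranked = sorted(range(n_vars), key=lambda i: rf_importances[i], reverse=True)
    rf_keep = set(ranked[:n_keep])

    # Hybrid selection: keep only variables that pass both filters.
    return sorted(vip_keep & rf_keep)


# Toy scores for 10 variables (made-up numbers, for illustration only).
vip = [1.4, 0.6, 1.1, 0.9, 1.8, 0.3, 1.0, 0.7, 1.2, 0.5]
rf = [0.20, 0.05, 0.02, 0.15, 0.25, 0.01, 0.18, 0.03, 0.04, 0.07]
selected = hybrid_select(vip, rf)
```

In the toy example, five variables pass the VIP filter and three pass the random forest filter, and `selected` holds the indices common to both. Requiring agreement between a linear and a tree-based importance measure is what makes the selection stricter than either filter alone.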

## Linked entities

- **Species:** Lonicera caerulea (taxon 134520)

## Full-text entities

- **Diseases:** injury (MESH:D014947), inflammation (MESH:D007249)
- **Chemicals:** Deuterated Triglycine Sulfate (-), phenols (MESH:D010636), Na2CO3 (MESH:C005686), CO2 (MESH:D002245), Polyphenol (MESH:D059808), chlorogenic acid (MESH:D002726), cyanidin-3-O-glucoside (MESH:C462279), flavonoids (MESH:D005419), caffeic acid (MESH:C040048), anthocyanin (MESH:D000872), gallic acid (MESH:D005707), sugar (MESH:D000073893), ester (MESH:D004952), H2O (MESH:D014867), quinic acid (MESH:D011801), ethanol (MESH:D000431), polyethylene (MESH:D020959), diamond (MESH:D018130)
- **Species:** Homo sapiens (human, species) [taxon 9606], Lonicera caerulea (blue honeysuckle, species) [taxon 134520]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12942758/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12942758/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12942758/full.md

---
Source: https://tomesphere.com/paper/PMC12942758