# Machine learning classification of mango maturity based on carotene content from Raman spectra

**Authors:** Ji Loun Tan, Fazida Hanim Hashim, Jahariah Sampe, Aqilah Baseri Huddin, Ghassan Maan Salim, Sawal Hamid Md Ali

PMC · DOI: 10.7717/peerj.20288 · PeerJ · 2025-11-18

## TL;DR

This paper presents a non-invasive method using Raman spectroscopy and machine learning to accurately classify mango ripeness based on carotene content.

## Contribution

The study introduces a Raman spectroscopy-based method for classifying mango ripeness from carotenoid peak intensities, reaching 100% accuracy with three classifiers: Medium Gaussian SVM, Cubic SVM, and Weighted K-Nearest Neighbors.

## Key findings

- Raman spectra revealed carotenoids like lycopene and β-carotene in mango peel, linked to ripeness levels.
- Three classifiers (Medium Gaussian SVM, Cubic SVM, and Weighted K-Nearest Neighbors) achieved 100% accuracy in classifying underripe, ripe, and overripe mangoes.
- The authors report Raman spectroscopy to be robust against external factors such as light, humidity, and noise.
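
To make the classification step concrete, here is a minimal sketch of a distance-weighted k-nearest-neighbors classifier over a single peak-intensity feature. It is not the authors' implementation: the function name `weighted_knn_predict`, the inverse-squared-distance weighting, `k=3`, and every intensity value are assumptions chosen for illustration, and the toy ordering of intensities across classes does not reflect the paper's data.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by distance-weighted voting among its k nearest neighbours."""
    # Euclidean distance from the query to every training sample.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Inverse-squared-distance weight (an assumed choice);
        # the small epsilon avoids division by zero on exact matches.
        votes[label] += 1.0 / (d * d + 1e-12)
    return max(votes, key=votes.get)


# Made-up peak intensities in arbitrary units, NOT data from the paper.
train_X = [(0.10,), (0.15,), (0.50,), (0.55,), (0.90,), (0.95,)]
train_y = ["underripe", "underripe", "ripe", "ripe", "overripe", "overripe"]

print(weighted_knn_predict(train_X, train_y, (0.52,)))
```

Closer neighbours dominate the vote, so a sample near a class boundary is less affected by distant training points than it would be under plain majority voting.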

## Abstract

Determining mango ripeness is essential for ensuring its delicious taste, enticing aroma, and rich nutritional value. For farmers, harvesting mangoes too early can result in stunted fruit and lower yields compared to those harvested at a ripe stage. This study aims to develop a potentially non-invasive and efficient method for detecting mango ripeness using Raman spectroscopy. Traditional methods, which rely on human assessment and color evaluation with image processing, are inconsistent, inaccurate, and time-consuming due to variations in mango color and individual differences in vision and perception. To address these limitations, this study pursued three main objectives: extracting data characteristics of organic compounds in mangoes based on raw Raman spectrum data, identifying the correlation between carotene characteristics and mango ripeness levels, and evaluating the performance of machine learning models in classifying mango ripeness levels. A total of 29 mango fruit spectra were analyzed, with 13 samples selected to represent three ripeness categories: underripe, ripe, and overripe. Raman spectra peak signal analysis revealed that mango peel contains lycopene, β-carotene, lutein, and neoxanthin, all of which are derived from carotenoid molecules in the range of 1,480 cm⁻¹ to 1,550 cm⁻¹. Statistical analysis confirmed the significance (p < 0.05) of extracted Raman peak intensity features in distinguishing ripeness levels, supported by high correlation coefficients between carotenoid peak intensity and mango maturity. This study achieved 100% accuracy in classifying mango ripeness levels using three classifier models: the Medium Gaussian Support Vector Machine, the Cubic Support Vector Machine, and the Weighted K-Nearest Neighbors. Raman spectroscopy has proven to be a reliable and robust method, immune to external factors such as light, humidity, and noise, which makes it a promising approach for assessing mango ripeness.
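
The peak-intensity feature described in the abstract can be illustrated with a short sketch that finds the strongest point of a spectrum inside the 1,480 to 1,550 cm⁻¹ carotenoid range. This is an assumption-laden illustration, not the paper's pipeline: the helper `band_peak` is hypothetical, the spectrum is synthetic, and no baseline correction or smoothing is applied.

```python
import math


def band_peak(wavenumbers, intensities, lo=1480.0, hi=1550.0):
    """Return (wavenumber, intensity) of the strongest point inside [lo, hi] cm^-1."""
    in_band = [(i, w) for w, i in zip(wavenumbers, intensities) if lo <= w <= hi]
    if not in_band:
        raise ValueError("spectrum has no points in the requested band")
    intensity, wavenumber = max(in_band)
    return wavenumber, intensity


# Synthetic spectrum (NOT measured data): a flat baseline of 100 plus two
# Gaussian peaks, one inside the band at 1525 and a taller one outside at 1440.
wavenumbers = [1400 + 5 * n for n in range(41)]  # 1400 .. 1600 cm^-1
intensities = [
    100
    + 50 * math.exp(-((w - 1525) / 10) ** 2)
    + 80 * math.exp(-((w - 1440) / 8) ** 2)
    for w in wavenumbers
]

peak_wavenumber, peak_intensity = band_peak(wavenumbers, intensities)
print(peak_wavenumber, round(peak_intensity, 2))
```

Restricting the search to the band keeps the taller out-of-band peak at 1440 from being picked, which is why the feature is defined on a wavenumber window rather than on the global maximum.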

## Linked entities

- **Chemicals:** lycopene (PubChem CID 446925), β-carotene (PubChem CID 573), lutein (PubChem CID 181579), neoxanthin (PubChem CID 5282217), carotenoid (PubChem CID 11227325)

## Full-text entities

- **Diseases:** stunted (MESH:D006130)
- **Chemicals:** lutein (MESH:D014975), lycopene (MESH:D000077276), carotene (MESH:D002338), beta-carotene (MESH:D019207), neoxanthin (MESH:C011947)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mangifera indica (mango, species) [taxon 29780]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12637035/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12637035/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12637035/full.md

---
Source: https://tomesphere.com/paper/PMC12637035