# Multimodal perishable fruits and vegetables dataset

**Authors:** Devika Unnikrishnan, Krishna Deepak, Yogini Aishwaryaa P T S, Bagyammal T

PMC · DOI: 10.1016/j.dib.2026.112545 · 2026-02-06

## TL;DR

This paper introduces a dataset for assessing the freshness of fruits and vegetables that combines IR-fusion images, sRGB images, and methane sensor readings, with the aim of supporting smart agriculture and reducing food waste.

## Contribution

The paper presents a novel multimodal dataset for non-invasive freshness assessment of perishable produce.

## Key findings

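- The dataset includes IR-fusion images, sRGB images, and methane concentration readings for six fruits and vegetables commonly exported from India: guava, carrot, tomato, Indian gooseberry, banana, and mango.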
- It supports research in spoilage detection and shelf-life prediction using multimodal data fusion.
- The dataset aligns with Agriculture 5.0 goals by enabling intelligent, automated quality assessment.

## Abstract

There is a growing need in the agricultural industry for non-invasive methods to classify the freshness and quality of produce. To address this, we developed a multimodal dataset comprising six commonly exported fruits and vegetables from India: guava, carrot, tomato, Indian gooseberry, banana, and mango. The specimens were allowed to undergo decomposition in an indoor environment with natural lighting, ambient temperature fluctuations and controlled airflow. During this process, IR-fusion images, sRGB images, and methane concentration readings were collected over a varying period and compiled. The dataset supports research in classification, food spoilage detection, shelf-life prediction, multimodal data fusion, non-invasive fruit quality assessment, and deep learning-based freshness assessment, particularly for export-oriented supply chains. The dataset is motivated by the need to reduce post-harvest losses and improve food quality monitoring, where spoilage indicators are often not detectable through visual inspection alone; the integration of imaging and gas-based sensing enables more reliable and automated freshness assessment. The dataset, with a total size of 18.99 GB, contains over 14,000 sRGB images, 14,500 IR-fusion images, and 18 methane sensor files, organized into Normal and Classified (Spoiled/Not_spoiled) categories. This multimodal design enables the study of thermal, visual, and chemical spoilage indicators simultaneously. This work aligns with the principles of smart agriculture, which promote the use of modern, data-driven technologies to optimize resource use and enable real-time monitoring for sustainable and efficient agricultural practices. Within the Agriculture 5.0 (AG 5.0) paradigm, the integration of Artificial Intelligence (AI), the Internet of Things (IoT), and complementary sensing modalities plays a central role in advancing innovation across farming and post-harvest management. In this context, the proposed multimodal dataset supports AG 5.0 objectives by enabling intelligent, automated, and non-invasive produce quality assessment, thereby improving decision-making and reducing waste throughout agricultural and export-oriented supply chains.
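The abstract states that the images are organized into Normal and Classified (Spoiled/Not_spoiled) categories. As a rough illustration of how such a layout might be tallied before training a classifier, the following Python sketch counts image files under folders named `Spoiled` and `Not_spoiled`. The exact directory nesting, the image file extensions, and the function name `count_labelled_images` are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter
from pathlib import Path

# Assumption: images use one of these extensions; the paper's abstract
# does not state the file format.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

# Label folder names as given in the abstract.
LABELS = ("Spoiled", "Not_spoiled")


def count_labelled_images(root):
    """Count image files that sit under a folder named after a label.

    Hypothetical helper: it assumes each labelled image has an ancestor
    directory called exactly 'Spoiled' or 'Not_spoiled' somewhere below
    `root`. Files outside such folders (e.g. under 'Normal') are ignored.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Inspect only the directories between root and the file itself.
        for part in path.relative_to(root).parts[:-1]:
            if part in LABELS:
                counts[part] += 1
                break
    return counts
```

Calling `count_labelled_images` on the folder where the dataset was extracted returns a `Counter` keyed by label, which gives a quick check of class balance across the Spoiled and Not_spoiled sets.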

## Linked entities

- **Chemicals:** methane (PubChem CID 297)

## Full-text entities

- **Diseases:** fungal (MESH:D009181), bruising (MESH:D003288)
- **Chemicals:** MQ-4 (-), Methane (MESH:D008697)
- **Species:** Solanum lycopersicum (tomato, species) [taxon 4081], Emblica officinalis (amla, species) [taxon 296036], Daucus carota (carrot, species) [taxon 4039], Mangifera indica (mango, species) [taxon 29780], Musa acuminata (banana, species) [taxon 4641]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12925515/full.md

---
Source: https://tomesphere.com/paper/PMC12925515