BrinjalFruitX: A field-collected image dataset for machine learning and deep learning-based disease identification in brinjal fruits
Abu Kowshir Bitto, Md. Zahid Hasan, Md. Hasan Imam Bijoy, Khalid Been Badruzzaman Biplob, Mohammad Mahadi Hassan, Mohammad Shohel Rana, Abdul Kadar Muhammad Masum

TL;DR
This paper introduces a new dataset of brinjal fruit images for training machine learning models to detect diseases, aiming to improve crop management and reduce losses.
Contribution
The novel contribution is a comprehensive, field-collected dataset of brinjal fruit diseases with five classes, specifically for deep learning applications.
Findings
The dataset contains 1823 high-quality labeled images collected from real farm conditions in Bangladesh.
It includes five distinct classes of brinjal fruit diseases and healthy fruits for accurate disease detection training.
The dataset is intended to support precision agriculture and improve early disease detection for farmers.
Abstract
Brinjal (Solanum melongena) or eggplant is one of the four most essential vegetable crops that are grown in Bangladesh and contribute significantly to the agricultural industry of the country. Brinjal supports the livelihood of numerous small farmers; however, brinjal is severely susceptible to various fruit diseases, which have serious impacts on yield quality and may cause considerable economic losses. While most existing plant disease datasets primarily focus on leaf-related disorders, only a limited number include fruit-related diseases and even those contain very few classes. This gap is significant because fruit diseases directly affect crop quality, market value, and overall yield. This is why we present here a new and comprehensive dataset that is unparalleled, exclusively for brinjal fruit diseases. This data set consists of 1823 high-quality, labelled images, across five…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSmart Agriculture and AI · Plant Disease Management Techniques · Irrigation Practices and Water Management
Specifications TableSubjectComputer ScienceSpecific subject areaImage Classification, Image Identification, Deep Learning, and Computer VisionType of dataImage (.JPG)Data collectionThe data was collected from various farming areas in Bangladesh, including Bogura, and Dhaka, during January and February 2025. It represents a wide range of environmental conditions, captured with high-resolution mobile cameras like the Poco F3 and Samsung A52. The aim of the dataset is to support deep learning applications for disease detection, ultimately helping farmers manage crops more effectively and reduce financial losses caused by fruit diseases.Data source locationBogura, BangladeshLatitude: 24.8483° N, Longitude: 89.2220° EZone: Rajshahi, Country: BangladeshDhaka, BangladeshLatitude: 23.8103° N, Longitude: 90.3200° EZone: Dhaka, Country: BangladeshData accessibilityRepository name: Mendeley Data…Data identification number: 10.17632/ngc58fsxgd.1Direct URL to data: https://data.mendeley.com/datasets/ngc58fsxgd/1Related research articleNone
Value of the Data
1
- •This dataset gives a pioneering and highly specific collection of images committed to brinjal (eggplant) fruit disease in isolation, an area significantly underrepresented within existing plant disease collections. It consists of 1823 high-resolution images organized into five distinct classes: Phomopsis Blight, Shoot and Fruit Borer, Brinjal Fruit Cracking, Wet Rot, and Healthy Fruit. This dataset is crucial to the search for early disease identification in brinjal, a crucial crop in Bangladesh, as well as the closing of an important knowledge gap in plant disease research.
- •The images contained in this dataset were captured under actual-field conditions under different weather conditions, and thus it is extremely suitable for training deep learning models for real-field, ground-level disease identification. This renders it a fundamental instrument in the development of AI-powered tools for brinjal crop health management by farmers to prevent economic loss due to fruit diseases. With the incorporation of photographs captured under varied environmental conditions, this dataset ensures the promise of creating consistent models that can perform better and working under changing agricultural conditions.
- •Unlike most datasets available to date, focusing primarily on leaf diseases, only a limited number include fruit-related diseases and even those contain very few classes. This dataset directly targets fruit-specific diseases, marking a pioneering work in plant disease research. Its unique feature of focusing on diseases of brinjal fruits will help researchers develop customized algorithms capable of detecting issues at the level of the fruit because of diseases, thus improving the quality and marketability of brinjal. This holds a tremendous scope for agricultural growth and improved measures for controlling diseases for offering better-quality produce and greater revenue for farmers.
- •This rich image dataset of diverse images is well-suited to a range of machine learning applications, including image classification, disease diagnosis, and crop management optimization. Its heavy labeling of disease classes offers a solid platform upon which to build diagnostic models that can be immediately employed in precision farming systems to allow farmers to make timely, informed decisions about pesticide application, harvesting, and post-harvest treatment.
- •The dataset’s design also makes it suitable for use in the development of Internet of Things (IoT) systems and smart agricultural technologies. Adding this dataset to real-time monitoring software helps researchers develop smart sensors that can detect disease at the onset, providing farmers immediate alert. This integration can enhance food traceability, ensure better food safety standards, and reduce the application of chemical treatments, thus facilitating sustainable agriculture.
- •Apart from its direct use in detecting brinjal disease, this dataset has broader applications in agricultural technology, such as crop disease monitoring and public health surveillance. It is beneficial for domains like computer vision, agricultural informatics, and decision-making systems based on AI. Such a rich dataset can enable researchers to benchmark algorithms, compare the performance of disease detection systems, and further research for automated quality control systems for agriculture.
Background
2
The main motivation behind preparing this dataset is to aid in solving the serious issue of brinjal (eggplant) plant disease detection, a vital crop of Bangladesh. Brinjal falls among the top four most valuable vegetable crops of Bangladesh, and many small-scale farmers' livelihood relies on it. However, brinjal is highly susceptible to various fruit diseases, which have a negative effect on yield and price. Although fruit-related diseases are important, there was a major gap in all the available data sets focusing primarily on leaf diseases and not fruit-related diseases. Only a limited number include fruit-related diseases and even those contain very few classes and an insufficient number of samples. It was with the aim to bridge this gap and assist in the development of machine learning and computer vision models for the detection of brinjal fruit disease at an early stage that this data set was established.
The data set was created by taking high-resolution pictures of brinjal fruits with five various diseases—Phomopsis Blight, Shoot and Fruit Borer, Brinjal Fruit Cracking, Wet Rot, and Healthy Fruit—straight from real farm conditions from across Bangladesh. The pictures were taken from various farming regions, like Bogura, Dhaka, to ensure variety in environmental conditions such as lighting, weather, and humidity. This approach ensures that the dataset reflects actual on-farm conditions, making it possible to utilize it in model training that can generalize to different farming conditions.
This dataset is designed to facilitate ongoing research for automatic fruit disease detection and classification and thus improve crop management practice. The dataset is an invaluable resource for developing AI-powered tools that will allow farmers to detect diseases at an early stage, optimize the utilization of resources, reduce dependence on chemicals, and optimize agricultural productivity and food safety in Bangladesh and other similar regions.
Data Description
3
In recent years, the use of image sets for plant disease detection has become more important, driven mostly by increased use of Artificial Intelligence in agri-technology. Although common datasets like PlantVillage helped researchers in making strides, most of them are only interested in diseases found on leaves and do not pay much attention to those manifested on fruits. This is particularly true for brinjal, a key vegetable crop in Bangladesh. Existing brinjal datasets are characterized primarily in terms of leaves and do not address adequately the variety of diseases that afflict the fruit. Single research groups have created small sets of images, but they're not published or disseminated to the wider scientific and farming communities. Such lack of brinjal fruit disease data is a significant roadblock to the development of useful detection tools. To fill this lacuna, we have developed a new, authentic dataset consisting entirely of images of brinjal fruit diseases captured directly from farms across Bangladesh. These images reflect real farm conditions, with diverse weather and environmental settings, which affect disease growth and appearance. Brinjal or eggplant is one of the four most important vegetable crops that are being grown in Bangladesh and that contribute to the livelihood of many small farmers. Fruit diseases reduce yields and cause economic losses, thus early and accurate diagnosis is crucial. Our data collection contains 1823 high-resolution images that are hand-tagged to five categories: Phomopsis Blight, Shoot and Fruit Borer, Fruit Cracking, Wet Rot, and Healthy Fruit. Our data collection is expected to spur research in plant pathology and assist in the construction of reliable AI models for early disease identification. Finally, we hope that this work will help improve crop management practices so that farmers can enhance the quality of their harvest and generate a better income.
During January and February 2025, a particular dataset based on brinjal fruit diseases was created by using images that were collected from major agricultural sites of Bogura, Dhaka. Images were taken under real field conditions, in various weather conditions—sunny morning, cloudy afternoon, and foggy morning—with temperature ranging from 18 °C to 24 °C and humidity ranging between 65 % and 80 %. High-resolution phones such as the Poco F3, Samsung A52 were used to obtain sharp and high-quality images. In contrast to most existing collections that are leaf-oriented infections, this dataset specifically targets fruit-oriented diseases. It has five clearly demarcated classes: Phomopsis Blight, which causes brown spots and internal rot due to Phomopsis vexans [1]; Fruit and Shoot Borer, an insect (Leucinodes orbonalis) that consumes fruit and stem from within [2]; Brinjal Fruit Cracking, a disorder typically linked to uneven moisture content or rapid growth, leading to cracked skin [3]; Brinjal Wet Rot, a fungal infection due to Choanephora cucurbitarum and Rhizopus Stolonifer, with soft, decaying tissue [4]; Healthy Brinjal, showing fruits with no visible disease. Every image of the dataset was checked carefully, annotated, and verified with the help of Daffodil International University's Department of Agriculture so that it is of the required standards to detect diseases accurately and can be used in machine learning applications.
Table 1 and Fig. 1 present the distribution of images over the different disease classes in our dataset, which is particularly targeting fruit-related diseases. A total of 2223 images was initially gathered from various sources; however, after scrupulously eliminating duplicates, biased samples, and poor-quality images inappropriate for training a model, the final dataset consists of 1823 good-quality images. These photos fall under the following categories: 200 Fruit Cracking, 514 Healthy Brinjal, 161 Phomopsis Blight, 725 Fruit and Shoot Borer, and 223 Wet Rot.Table 1. Category-wise data distribution.Table 1 dummy alt textCategoryNumber of original imagesShoot and Fruit Borer725Healthy514Wet Rot223Brinjal Fruit Cracking200Phomopsis Blight161Total1823Fig. 1Dataset distribution graph.Fig 1 dummy alt text
We can see from the Table 2 that the observations are divided into five classes representing an inconsistent brinjal fruit condition. Shoot and Fruit Borer (SFB) is a product of infestation by larvae of Leucinodes orbonalis, leading to internal fruit rot and decay. Wet Rot (WR), a product of infestation by Colletotrichum spp., forms water-soaked lesions that form, blacken, and rot. Brinjal Fruit Cracking (BFC) is a physiological disorder caused by environmental stress, leading to cracking of the fruit skin. Phomopsis Blight (PB), caused by infection of the pathogen Phomopsis vexans, leads to dark sunken lesions resulting in internal rotting. Brinjal fruits that are healthy are firm, smooth, and free from any defect, indicating good health and crop management. These classes help in the identification and knowledge of various diseases and conditions of brinjal fruit quality.Table 2. Class-wise data description with visualization.Table 2 dummy alt textClassDefinitionSampleShoot and Fruit BorerThis disease is caused by the larval stage of the insect Leucinodes orbonalis. The larvae bore into both shoots and fruits, causing internal damage that is not immediately visible from the outside. Infested fruits show exit holes and often rot from the inside, making them unfit for consumption or sale. BFSB is one of the most destructive pests affecting brinjal production.Image, table 2 dummy alt textWet RotCaused by Colletotrichum spp., this disease presents as small, water-soaked lesions on the fruit that expand and turn dark with time. These lesions often crack or sink into the fruit, resulting in tissue breakdown and rot. The disease reduces the visual and structural quality of the fruit, affecting its commercial value.Image, table 2 dummy alt textBrinjal Fruit CrackingFruit cracking in brinjal is a physiological disorder often caused by irregular watering, rapid fruit growth, or sudden changes in humidity. Cracks typically appear on the outer skin of mature fruits and may expose the inner flesh, making the fruit vulnerable to secondary infections and reducing market value. It is not caused by pathogens but is often aggravated by environmental stress and nutrient imbalance.Image, table 2 dummy alt textPhomopsis BlightPhomopsis fruit rot is caused by the fungal pathogen Phomopsis vexans. It typically manifests as dark, sunken lesions on the surface of the brinjal fruit, which gradually enlarge and lead to internal decay. The disease thrives in warm, humid environments and significantly reduces fruit quality and marketability due to soft rotting.Image, table 2 dummy alt textHealthyA healthy brinjal fruit is firm, smooth, and glossy with a uniform shape and deep color (typically purple, green, or white depending on the variety). It is free from visible deformities, lesions, or insect damage. Healthy fruits contribute to high marketability and reflect proper crop management and disease control practices in the field.Image, table 2 dummy alt text
We can see from the Table 3 that various studies have been accountable for creating leaf disease datasets for brinjal and other crops. The datasets, as referenced from [[5], [6], [7], [8], [9]], are focused on different aspects of leaf disease classification, such as large-scale datasets, balanced classes, and real-world, all with the goal of improving model training for leaf disease detection in real field environments. However, these datasets are primarily focused on diseases of leaves rather than fruits. Although a few datasets [10,11] have attempted to focus on brinjal (eggplant) fruit diseases, they remain highly limited in class diversity. Several important and economically significant diseases are either underrepresented or entirely absent, reducing the usefulness of these datasets for real-world disease detection. Moreover, to the best of our knowledge, no comprehensive research paper specifically addressing brinjal fruit disease dataset is currently available. This highlights a clear research gap and underscores the need for a more complete and publicly accessible dataset dedicated to brinjal fruit diseases.Table 3. Comparison of existing works on brinjal.Table 3 dummy alt textReferencesYearDisease TypeClassContribution[5]2025Leaf Disease6Large-scale leaf disease dataset under natural field conditions[6]2024Leaf Disease11Focus on classification with balanced leaf disease categories[7]2023Leaf Disease7Leaf disease images for improved model training[8]2024Leaf Disease10Comprehensive dataset supporting multi leaf disease detection[9]2025Leaf Disease6Extensive image set focusing on real-world application scenarios[10]2022Fruit Disease4Image Dataset on Brinjal Fruit Disease[11]2022Fruit Disease4Image Dataset on Brinjal Fruit DiseaseOur Dataset2025Fruit Disease5This is the first known collection to feature 1823 distinct and non-duplicate images of brinjal fruit diseases, which can support early detection of fruit infections.
On the other hand, our dataset developed in 2025 is the first one that is specifically known to be solely dedicated to brinjal fruit diseases with 5 classes. It comprises 1823 non-redundant and unique images representing five significant fruit diseases of brinjal. This dataset addresses a significant research gap of existing literature since it is the first one to aim for fruit-related disorders such as Phomopsis Blight, Fruit and Shoot Borer, Fruit Cracking, Wet Rot, and Healthy Brinjal. The novelty of our dataset lies in its one-pointed focus on fruit diseases, with different disease classes and the ability to support early detection of fruit infection, hence making it an invaluable resource towards enhancing both research and application in agricultural disease control. Although the dataset includes five major brinjal fruit diseases, several additional diseases commonly found in Asian growing conditions such as Alternaria Fruit Rot and Fusarium disorders were not visible during the data collection period. As a result, these categories could not be included.
Experimental Design, Materials and Methods
4
The research design for the brinjal dataset followed a structured approach, beginning with the capture of fruit images using two mobile cameras to introduce diversity in lighting, angles, and image quality. The dataset includes five main categories: Phomopsis Blight, Fruit and Shoot Borer, Brinjal Fruit Cracking, Wet Rot, and Healthy Brinjal. Agricultural experts validated the dataset to ensure the accuracy of classifications. The process of dataset preparation and classification, leading to integration into an expert system, is illustrated in Fig. 2.Fig. 2. Step by step working procedure of fruits and its condition classification.Fig 2 dummy alt text
Materials and equipment’s specification
4.1
The data were collected over a period of days, January to February 2025, by two individuals from two different regions of Bangladesh, Bogura and Dhaka. The data were taken under different weather conditions of sunny, foggy, and cloud weather with temperature ranging from 18 °C to 24 °C and humidity ranging from 65 % to 80 % shows in Table 4. The images were captured using two mobile camera gadgets, which are Poco F3 with 48MP and Samsung A52 with 64MP. Collection of data employed mixed use of the devices according to availability and weather. For example, on 10th Jan 2025, the pictures were taken with a mix of 60 % Poco F3 and 40 % Samsung A52 in Bogura, and on 3rd Feb 2025, all the pictures were taken using the Poco F3 (100 %) in Dhaka. This heterogeneity of strategy, combined with the heterogeneity of lighting and environmental conditions, ensures that the dataset captures a wide range of image qualities and viewpoints and is therefore more robust to train and test models.Table 4. Details of camera specification.Table 4 dummy alt textDateWeatherTimeTemp ( °C)Humidity ( %)Camera DeviceLocation10 Jan 2025SunnyMorning1875Poco F3 (60 %), Samsung A52 (40 %)Bogura15 Jan 2025FoggyAfternoon2080Samsung A52 (100 %)Dhaka20 Jan 2025CloudyNoon2270Poco F3 (50 %), Samsung A52 (50 %)Dhaka23 Jan 2025SunnyMorning1978Poco F3 (100 %)Bogura30 Jan 2025CloudyAfternoon2172Samsung A52 (70 %), Samsung A52 (30 %)Dhaka3 Feb 2025SunnyNoon2465Poco F3 (100 %)Dhaka5 Feb 2025CloudyMorning2368Samsung A52 (100 %)Bogura
Geographical location
4.2
The fruit image dataset was systematically collected through extensive field surveys conducted in Bangladesh between 10 January 2025 to February 5, 2025. The selection of collection sites was strategically planned to encompass diverse geographical, environmental, and agricultural conditions, ensuring a well-balanced and representative dataset. The geographic locations of the brinjal data collection sites are as follows:
- •Bogura, Bangladesh, Latitude: 24.8483° N, Longitude: 89.2220° E, Zone: Rajshahi, Country: Bangladesh
- •Dhaka, Bangladesh, Latitude: 23.8103° N, Longitude: 90.3200° E Zone: Dhaka, Country: Bangladesh
Data validation and annotation
4.3
The data validation and annotation process involved thorough inspection and verification by agricultural experts to ensure the accuracy of fruit classifications into five distinct categories: Healthy Brinjal, Phomopsis Blight, Fruit and Shoot Borer, Fruit Cracking, and Wet Rot. Expert agronomists carefully assessed the images to confirm the correct categorization of each fruit, considering various disease symptoms and fruit conditions. After capturing the images, experts conducted field inspections to verify the presence of specific disease traits. The validation process adhered to a systematic approach that guaranteed the dataset's reliability and accuracy for machine learning applications. Disease labels in this dataset were assigned based on field-level visual symptoms only, without laboratory or molecular confirmation. All visual diagnoses were cross-checked by Professor Dr. M. A. Rahim (Head, Department of Agricultural Science, DIU) to improve reliability, but the dataset should be considered visually identified rather than pathogen-verified.
Preprocessing
4.4
To preprocess the dataset, we employed duplicate image removal techniques (shows in Algorithms 1) to ensure the dataset had only unique, high-quality images. Since the dataset was captured by two individuals from different regions and under varying conditions, there was a possibility of duplicate images being captured. For this reason, we employed an image deduplication process using perceptual hashing. A similarity threshold of 0.90 was applied, and 73 duplicate images were eliminated. This process involved converting images into grayscale format, resizing them to the same size, and computing a perceptual hash value of every image. Hashing these values one against the other helped us identify and remove duplicate images and thus make sure that unique samples only were counted in the dataset. After the removal of duplicates, the remaining images were annotated thoroughly in accordance with visual verification and expert labelling. It maintained uniformity in the dataset by categorizing each of the images into one of the five classes of diseases: Healthy Brinjal, Phomopsis Blight, Fruit and Shoot Borer, Fruit Cracking, and Wet Rot. This preprocessing step was done to eliminate redundancy, improve dataset quality, and make the images used for model training diverse and reflective of real conditions.Algorithm 1Duplicate value remove algorithms.Algorithm 1 dummy alt textAlgorithm 1 Image Deduplication Using Perceptual HashingInput:Input: Image collection ImgSet = {img₁, img₂, …, img_n_} containing n imagesOutput:Filtered collection UniqueImgs containing only distinct imagesStage −1:Initialize an empty set for storing hash values: HashRegistry ← {}Stage −2:Initialize an empty list for storing unique images: UniqueImgs ← {}Stage −3:**FOR each photo in ImgSet:**Stage −4: Convert photo to grayscale formatStage −5: Resize photo to a uniform size (height, width)Stage −6: Compute perceptual hash value hashVal using the pHash algorithmStage −7: **IF hashVal is not in HashRegistry:**Stage −8: Add hashVal to HashRegistryStage −9: Append photo to UniqueImgsStage −10: **ELSE:**Stage −11: Label photo as a duplicate (do not add to output)Stage −12: END IFStage −13:End FORStage −14:Return Unique Imgs as the deduplicated image set
Model implementation
4.5
We employed several machine learning models like ResNet50, DenseNet121, MobileNetV2, EfficientNetB0, and VGG16 to evaluate the performance of the brinjal fruit disease dataset. Preprocessing involved a series of steps that prepare the dataset to be used for model training, which included resizing the images into a uniform dimension, employing data augmentation techniques to enhance dataset diversity, and performing a train-test split to ensure it's being tested in the correct way. After model training, from Table 5, we observed the following results: MobileNetV2 achieved a peak accuracy of 93.98 %, followed by the remaining models, which all achieved high accuracies, with ResNet50 achieving 93.88 %, DenseNet121 achieving 93.78 %, and EfficientNetB0 achieving 93.89 %. Although the VGG16 model achieved slightly lower, it achieved an accuracy of 93.22 %. Precision, recall, and F1 measures were all strong across all the models and therefore confirm that they can be used to classify the various classes of fruit diseases—Healthy Brinjal, Phomopsis Blight, Fruit and Shoot Borer, Fruit Cracking, and Wet Rot. The results confirm that the dataset can be utilized for training deep learning models and therefore can be utilized to develop automated systems for detecting diseases in brinjal crops. The complete code, along with augmentation scripts and model development, is publicly available in our GitHub repository [12].Table 5. Performance of applied deep learning models.Table 5 dummy alt textModelClassAccuracyPrecisionRecallF1 scoreResNet50Shoot and Fruit Borer (SFB)0.93880.890870.88Wet Rot (WR)0.960.970.97Brinjal Fruit Cracking (BFC)0.970.970.97Phomopsis Blight (PB)0.940.940.94Healthy0.940.940.94DenseNet121Shoot and Fruit Borer (SFB)0.93780.850.890.87Wet Rot (WR)0.970.990.98Brinjal Fruit Cracking (BFC)0.960.970.96Phomopsis Blight (PB)0.950.940.95Healthy0.960.900.93MobilNetV2Shoot and Fruit Borer (SFB)0.93980.900.880.89Wet Rot (WR)0.960.990.98Brinjal Fruit Cracking (BFC)0.950.990.97Phomopsis Blight (PB)0.950.920.94Healthy0.960.940.95EfficientNetB0Shoot and Fruit Borer (SFB)0.93890.900.880.89Wet Rot (WR)0.960.990.98Brinjal Fruit Cracking (BFC)0.950.990.97Phomopsis Blight (PB)0.950.920.94Healthy0.960.940.95VGG16Shoot and Fruit Borer (SFB)0.93220.850.850.85Wet Rot (WR)1.000.970.98Brinjal Fruit Cracking (BFC)0.950.970.96Phomopsis Blight (PB)0.900.980.94Healthy0.960.890.92
Aside from its value for research, this information also opens huge opportunities for real-world applications, particularly in tracking supply chains and ensuring food security. In cold-chain supply, it can be used on installed IoT systems, including sensors and cameras with machine learning software in a bid to enable real-time fruit classification. These systems can be employed to monitor vital environmental parameters, such as temperature, humidity, and gas concentration, across fruit transportation and storage so that spoilage, contamination, or formalin adulteration can be detected early. Formalin-mixed fruit samples included in the dataset provide added value to ensure it can be utilized in regulatory inspections for curbing chemical adulteration. Additionally, the data set can be applied in smart inspection systems to guarantee quality, reduce food waste, and enable cloud-based systems to notify stakeholders of any quality deviations. To the consumer, the data set enables the development of mobile applications for assessing fruit quality via smartphone cameras, promoting responsible purchasing and public health awareness, particularly where food adulteration is high. Thus, the dataset not only aids in improving food quality and food safety but also plays a key role in supply chain transparency that works for farmers and consumers.
Limitations
The images were gathered from a few selected regions in Bangladesh, which may reduce the generalizability of the dataset when applied to different environmental or agricultural settings. A noticeable imbalance in the number of images across disease classes is present, particularly for less common diseases like Rhizopus and Wet Rot. With over 700 images of Shoot and Fruit Borer compared to only 161 images of Phomopsis Blight. This imbalance may affect model fairness and benchmarking reliability. To address this, researchers may apply class-balancing techniques such as SMOTE, ADASYN, or targeted data augmentation where an augmentation code is available in our code repository [12]. Another limitation is that only visible disease symptoms were documented. Internal or early-stage signs that are not externally apparent were not captured, which may reduce the model’s usefulness for early detection. Moreover, the absence of temporal data restricts the ability to analyze fruit spoilage progression over time, which could be valuable for time-series-based models in food quality monitoring*.*
Ethics Statement
We confirm that our study followed all ethical guidelines. No plants, animals, or people were harmed during the research. Also, we did not collect any data from social media. All authors agree to follow the ethical rules needed for publishing in Data in Brief.
CRediT Author Statement
Md. Zahid Hasan: Data Collection, Data Curation, Conceptualization, Writing – Review & Editing; Abu Kowshir Bitto: Data Collection, Data Curation, Conceptualization, Methodology, Visualization, Code Development, Writing; Md. Hasan Imam Bijoy: Data Collection, Data Curation, Methodology, Visualization, Validation, Writing; Khalid Been Badruzzaman Biplob: Data Curation, Code Development; Mohammad Mahadi Hassan: Funding*,* Writing – Review & Editing; Mohammad Shohel Rana: Funding*,* Writing – Review & Editing; Abdul Kadar Muhammad Masum: Funding*,* Writing – Review & Editing.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Rohini M.Jayapala N.Pushpalatha H.G.Gavirangappa H.Puttaswamy H.Ramachandrappa N.S.Biochemical, pathological and molecular characterisation of Phomopsis vexans: a causative of leaf blight and fruit rot in brinjal Microb. Pathog.179Jun. 202310611410.1016/J.MICPATH.2023.10611437060966 · doi ↗ · pubmed ↗
- 2Thomas G.Thakur V.Sharma S.Azizi S.Strategic implementation of conventional and advanced approaches to combat brinjal shoot and fruit borer Discov. Appl. Sci.74Apr. 202511710.1007/S 42452-025-06670-6 · doi ↗
- 3Kanchagar L.Kalmath B.S.Nadagouda S.Haveri R.Narayan J.Studies on improvement of semi-synthetic diet for mass rearing of brinjal shoot and fruit borer, Leucinodes orbonalis Guenee (Lepidoptera: crambidae) in laboratory Int. J. Trop. Insect. Sci.452Apr. 202586988110.1007/S 42690-025-01476-W · doi ↗
- 4AS.H.Nirmalnath P..J.MS.G.Sreenivasa M..N.Hegde R.V.Hegde Y.Bioefficacy of indigenous AM fungi against sclerotium rolfsii and their impact on the growth enhancement of brinjal Indian Phytopathol.782Jun. 202526927710.1007/S 42360-025-00839-0 · doi ↗
- 5M.I.R. Rafe, F.M. Nayem, S.B. Sarker, and A. al Shiam, “Eggplant_Leaf_Disease_Dataset,” vol. 2, 2025, doi: 10.17632/MN 8VFR 9BW 2.2.
- 6R.H. Rabbi, N.A.S. Rachin, L.R. Tanvir, and M.U. Mojumdar, “Eggplant leaf disease classification dataset,” vol. 2, 2024, doi: 10.17632/PVSV 534CCG.2.
- 7M.M.H.M. Mafi, Mst.A.A. Ava, Md.G.M. Moazzam, and M.S. Uddin, “Eggplant disease recognition dataset,” vol. 2, 2023, doi: 10.17632/R 3TB 5MZN 4D.2.
- 8M.A.S. Nirob, P. Bishshash, M. bin Ayan, T. Khatun, and M.S. Uddin, “Eggplant dataset: a comprehensive dataset for agricultural research and disease detection,” vol. 1, 2024, doi: 10.17632/5DRKK 544K 8.1.
