FoodBD: a polygon-annotated meal image dataset of Bangladeshi cuisines with visual and nutritional labels
Benzir Md Ahmed, Md. Enamul Haque, A. K. Obidul Huq, Mohammad Mehedy Masud, Mohammed Eunus Ali, Mahmuda Naznin

TL;DR
FoodBD is a dataset of Bangladeshi meal images with detailed annotations and nutritional data to support AI tools for dietary assessment and health monitoring.
Contribution
The dataset introduces a culturally diverse, polygon-annotated resource for Bangladeshi cuisine with nutritional labels.
Findings
The dataset includes 3,523 smartphone-captured meal images with minimal preprocessing.
1,837 images are annotated with expert-estimated nutritional information across six categories.
The dataset is split into training, validation, and test subsets for reproducible machine learning experiments.
Abstract
The FoodBD dataset was initially collected to support dietary assessment of diabetic patients. It was later expanded to address the lack of culturally diverse food image datasets, particularly for Bangladeshi cuisine, which is underrepresented in food recognition research. It supports tasks in computer vision, nutrition estimation, and health monitoring by providing a resource for AI-driven dietary assessment tools. FoodBD comprises 3,523 smartphone-captured meal images representing authentic Bangladeshi meals, with minimal preprocessing to preserve real-world complexity. Each image is annotated with polygon-based segmentation across 67 food categories. Additionally, 1,837 of these images include expert-estimated nutritional information (carbohydrate, protein, fat, fiber, calorie, and glycemic load). The dataset is split into training, validation, and test subsets,…
- United International University (https://doi.org/10.13039/100019458)
Objective
Accurate dietary assessment plays a critical role in health monitoring, chronic disease management, and nutritional research. In recent years, image-based meal tracking and food recognition have gained traction because of the increasing availability of mobile cameras and advances in computer vision. However, the effectiveness of such AI-driven systems depends heavily on the quality and diversity of their training data. Most publicly available food image datasets feature primarily Western [1, 2], East Asian [3–5], or mixed cultural [6–8] cuisines. This limits their applicability in regions such as Bangladesh, where food preparation, ingredients, and presentation differ significantly from those of the above-mentioned cuisines. The FoodBD dataset was developed to address this challenge.
The original objective of the FoodBD dataset was to support dietary assessment for diabetic patients in Bangladesh. This goal was later broadened to develop a high-quality, culturally contextualized dataset of Bangladeshi meals, enabling applications in both computer vision and health research. The dataset aims to help develop multimodal AI models for nutrition estimation, dietary monitoring, and chronic disease management. It was not previously published as part of a research paper, as the focus was specifically on dataset collection, annotation, and availability for reuse.
Data description
The dataset was collected from multiple locations across Dhaka, Bangladesh. It contains 3,523 smartphone-captured meal images collected under real-world conditions, covering breakfast, lunch, and dinner. Participants captured the images with their own mobile devices, which results in natural variability in resolution, lighting, and angles. To ensure usability, the images were standardized by resizing to 1024 × 1024 pixels using Roboflow’s [9] “Fit (black edges)” option, which preserves the original aspect ratio without cropping or distortion.
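The geometry of an aspect-preserving fit with black padding can be sketched as follows. This is a minimal illustration, not Roboflow's implementation: the function names, the rounding rule, and the centered placement of the padding are assumptions made here. It may be useful when mapping coordinates from an original photo onto the 1024 × 1024 canvas.

```python
def letterbox_geometry(width, height, target=1024):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    Assumption: the resized image is centered on a target x target canvas.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale so that the longer side exactly fills the target size.
    scale = target / max(width, height)
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))
    # Remaining space is filled with black bars, split evenly on both sides.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def to_canvas(x, y, width, height, target=1024):
    """Map a pixel coordinate in the original image onto the padded canvas."""
    new_w, new_h, pad_left, pad_top = letterbox_geometry(width, height, target)
    return x * new_w / width + pad_left, y * new_h / height + pad_top


# A hypothetical 4000 x 3000 photo becomes 1024 x 768 with 128 px bars
# above and below.
print(letterbox_geometry(4000, 3000))
```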
Each meal image is accompanied by detailed polygon-based annotations representing food items. All annotations were performed manually using the LabelMe [10] tool, following a standardized annotation protocol to ensure consistency in polygon boundaries and labeling style. To verify the quality of the annotations, multiple validation steps were implemented. First, faculty members from the Department of Food Technology and Nutritional Science reviewed each batch of annotations and provided domain-specific feedback on food category distinctions and portion boundaries. Second, a random subset of approximately 30% of the annotated images was cross-checked by a separate annotator to confirm boundary precision and class correctness. Finally, discrepancies identified during this review were resolved through discussion and expert input, ensuring high inter-annotator agreement. This multi-step validation process strengthened the consistency and reliability of the polygon annotations across the dataset.
A total of 12,575 annotations span 67 food classes, including both common staples (such as rice, ruti, fish, chicken, and lentils) and less frequently consumed items (such as mango, guava, and sweets). Food labels were recorded in both Bangla and English, ensuring cultural relevance while remaining accessible to an international research audience. The annotation files are stored in plain-text format, with each line representing one food item through a class ID followed by polygon coordinates (x, y) (see Dataset 2). A representative example of this file is also provided as a PNG image (see Data file 4) in the metadata for illustration purposes. Data file 3 illustrates the folder structure, whereas Data file 5 depicts the data collection and annotation workflow. Examples of original and annotated images are presented in Data file 7, which provides a visual result of the annotation process.
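A reader for the annotation format described above might look like the sketch below. It assumes whitespace-separated fields (an integer class ID followed by alternating x and y values); whether the coordinates are normalized or in pixels is not stated here, so the code treats them as plain floats. The function names are illustrative.

```python
def parse_annotation_line(line):
    """Parse one line of the form 'class_id x1 y1 x2 y2 ...'.

    Assumption: fields are whitespace-separated. Returns (class_id, points).
    """
    parts = line.split()
    # One class ID plus at least three (x, y) pairs, so an odd count >= 7.
    if len(parts) < 7 or len(parts) % 2 == 0:
        raise ValueError(f"malformed annotation line: {line!r}")
    class_id = int(parts[0])
    coords = [float(value) for value in parts[1:]]
    points = list(zip(coords[0::2], coords[1::2]))
    return class_id, points


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula (input units squared)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical line: class 3 with a square polygon.
class_id, points = parse_annotation_line("3 0.1 0.1 0.5 0.1 0.5 0.5 0.1 0.5")
print(class_id, len(points), polygon_area(points))
```

The polygon area is included only as a simple sanity check, for example to flag degenerate polygons with near-zero area.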
In addition to visual annotations, 1,837 images were enriched with nutritional metadata. These annotations were provided by a nutrition expert who carefully reviewed each meal image and (i) estimated the portion sizes of visible ingredients on the basis of dietary guidelines and professional expertise; (ii) assigned nutritional values per serving, covering carbohydrate (CHO), protein, fat, fiber, and calories, using standardized data sources [11, 12]; (iii) obtained the glycemic index (GI) of individual ingredients from [13]; (iv) calculated the glycemic load (GL) as GL = (GI × weight)/100, following [14], where weight is the estimated weight (in grams) of each ingredient, so that the indicator reflects both the GI and the amount consumed; and (v) aggregated the ingredient-level estimates to yield total CHO, protein, fat, fiber, and calorie values per image.
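Steps (iv) and (v) can be illustrated with a short sketch. The `Ingredient` record, its field names, and all numeric values below are hypothetical and are not taken from the dataset. Summing ingredient-level GL into a meal-level GL is also an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class Ingredient:
    # Hypothetical record; field names are illustrative only.
    name: str
    weight_g: float
    gi: float
    cho_g: float
    protein_g: float
    fat_g: float
    fiber_g: float
    kcal: float


def glycemic_load(gi, weight_g):
    """GL = (GI x weight) / 100, with weight in grams, as stated in the text."""
    return gi * weight_g / 100


def meal_totals(ingredients):
    """Aggregate ingredient-level estimates into per-image totals."""
    totals = {"cho_g": 0.0, "protein_g": 0.0, "fat_g": 0.0,
              "fiber_g": 0.0, "kcal": 0.0, "gl": 0.0}
    for item in ingredients:
        totals["cho_g"] += item.cho_g
        totals["protein_g"] += item.protein_g
        totals["fat_g"] += item.fat_g
        totals["fiber_g"] += item.fiber_g
        totals["kcal"] += item.kcal
        # Assumption: the meal-level GL is the sum of ingredient GLs.
        totals["gl"] += glycemic_load(item.gi, item.weight_g)
    return totals


# Made-up values, for illustration only.
meal = [
    Ingredient("rice", 150, 70, 42.0, 4.0, 0.5, 0.6, 195.0),
    Ingredient("lentils", 100, 30, 20.0, 9.0, 0.4, 8.0, 116.0),
]
print(meal_totals(meal))
```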
Standard serving guidelines specific to the Bangladeshi population were followed to ensure consistency in portion size estimation. These guidelines are based on indigenous household utensils and traditional food preparation practices, as first recommended by Ali et al. [15] and later endorsed in the book chapter “Serving Size and Food Exchange List” supported by the Ministry of Planning, Government of the People’s Republic of Bangladesh [12]. In addition to these standardized references, common household measures and visual cues (e.g., standardized plates, bowls, and spoons) were used during the annotation process to estimate portion sizes as accurately as possible. These details are intended to make the procedure transparent and reproducible for other researchers. The dual nature of FoodBD, as both an annotated image dataset and a nutritional dataset, supports a wide range of downstream applications, including calorie estimation, dietary monitoring, diabetes management, and AI-powered nutrition recommendation systems.
The dataset has been systematically organized into training (70%), validation (20%), and test (10%) splits to support standardized machine learning workflows. Each subset is placed in a dedicated directory with separate folders for images and annotations. A consistent naming convention, FoodBD-XXXX, ensures traceability between images and their annotation files. Metadata describing food instances, environmental conditions, and preprocessing details are compiled in Data file 1. The nutritional annotations are stored in Data file 2, which contains detailed macronutrient and GL values for each meal image.
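Because images and annotation files share the FoodBD-XXXX stem, a loader can pair them by file name. The sketch below assumes that each split directory contains subfolders named `images` and `labels` and that annotation files end in `.txt`. The folder names are placeholders, so check them against Data file 3 before use.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_split(split_dir, image_dir="images", label_dir="labels"):
    """Pair each image in a split with the annotation file sharing its stem.

    Assumption: folder names are placeholders, not confirmed by the source.
    Returns (pairs, missing), where missing lists images without annotations.
    """
    split_dir = Path(split_dir)
    labels = {p.stem: p for p in (split_dir / label_dir).glob("*.txt")}
    pairs, missing = [], []
    for image in sorted((split_dir / image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image file
        label = labels.get(image.stem)
        if label is None:
            missing.append(image)
        else:
            pairs.append((image, label))
    return pairs, missing
```

Calling `pair_split("train")` on a local copy would return the matched pairs along with any images that lack an annotation file, which is a quick integrity check after download.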
A summary of all dataset components and figures is presented in Table 1. This table lists the images, annotation files, metadata, nutritional files, and supporting figures, enabling researchers to easily locate the relevant resources in the repository.
Table 1. Overview of data files/datasets

| Label | Name of data file/dataset | File types (file extension) | Data repository and identifier (DOI or accession number) |
|---|---|---|---|
| Dataset 1 | FoodBD meal images | .jpg | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Dataset 2 | Polygon-based annotations | .txt | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 1 | FoodBD_Meta_data.csv | .csv | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 2 | MealNutrition1837_All.csv | .csv | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 3 | Dataset_Folder_Structure | .png | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 4 | Single_Meal_Annotation_File | .png | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 5 | Data_Collection_Pipeline | .png | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 6 | Class_Distribution_Chart | .png | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
| Data file 7 | Original_vs_Annotated_Images | .png | Mendeley (10.17632/xh3ghf3jbg.2) [16] |
Limitations
- Class imbalance, with some foods (e.g., rice, ruti, and chicken) overrepresented whereas others (e.g., guava, mango, and roll) are underrepresented (see Data file 6).
- Variability in image quality due to the use of different mobile devices, lighting, and angles.
- Manual annotations may contain minor human errors or subjective interpretations of food boundaries.
- In this dataset, glycemic load (GL) was calculated using the total weight of the food item rather than the grams of available carbohydrate. While this provides an approximate GL estimate, it may differ from values derived from available carbohydrate content, which is the conventional method in nutritional research. This limitation should be considered when interpreting or using the GL values in further analyses.
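The size of the gap described in the last limitation can be seen in a small worked example. The serving weight, GI, and available carbohydrate content below are hypothetical round numbers chosen for illustration, not values from the dataset.

```python
def gl_from_total_weight(gi, food_weight_g):
    """GL as computed in this dataset: GI x total food weight (g) / 100."""
    return gi * food_weight_g / 100


def gl_from_available_carbohydrate(gi, available_carb_g):
    """Conventional GL: GI x available carbohydrate (g) / 100."""
    return gi * available_carb_g / 100


# Hypothetical serving: 150 g of a food with GI 70 that contains
# 40 g of available carbohydrate.
weight_based = gl_from_total_weight(70, 150)
carb_based = gl_from_available_carbohydrate(70, 40)
print(weight_based, carb_based)
```

Under these assumed numbers the weight-based value is several times larger than the carbohydrate-based one, so the two should not be compared directly or mixed in one analysis.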
To address class imbalance, future users may apply data augmentation techniques such as horizontal or vertical flipping, rotation, color jittering, and brightness normalization, as well as balancing strategies like oversampling or weighted loss functions. Additionally, users may filter images based on resolution or lighting conditions to mitigate quality variations. Incorporating such preprocessing steps can help ensure more robust model training and improve the overall performance of subsequent analyses.
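One of the balancing strategies mentioned above, weighting classes by inverse frequency, can be sketched as follows. The function name and the counts are hypothetical. The formula used here, total / (number of classes × class count), is one common heuristic and not a method prescribed by the dataset authors.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count) for each class.

    Classes with a zero count are skipped to avoid division by zero.
    """
    present = {name: n for name, n in counts.items() if n > 0}
    if not present:
        return {}
    total = sum(present.values())
    n_classes = len(present)
    return {name: total / (n_classes * n) for name, n in present.items()}


# Made-up annotation counts, for illustration only.
weights = balanced_class_weights({"rice": 80, "guava": 20})
print(weights)
```

Such weights could be supplied to a weighted loss function or used as per-class sampling probabilities for oversampling. Which option works better depends on the model and the task.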
References
- Dwyer B, Nelson J, Hansen T, et al. 2025. Roboflow (Version 1.0) [Software]. Computer vision. https://roboflow.com.
