A Large-Scale Benchmark for Food Image Segmentation
Xiongwei Wu, Xin Fu, Ying Liu, Ee-Peng Lim, Steven C.H. Hoi, Qianru Sun

TL;DR
This paper introduces a large, detailed food image dataset and a multi-modality pre-training approach to improve food image segmentation, aiming to advance health-related applications and facilitate future research.
Contribution
It provides a new high-quality, large-scale food image dataset with detailed ingredient labels and masks, along with a novel pre-training method ReLeM for better segmentation performance.
Findings
ReLeM improves segmentation accuracy over baseline models
FoodSeg103 and FoodSeg154 serve as new benchmarks
Public datasets and models support future research
Abstract
Food image segmentation is a critical and indispensable task for developing health-related applications such as estimating food calories and nutrients. Existing food image segmentation models underperform for two reasons: (1) there is a lack of high-quality food image datasets with fine-grained ingredient labels and pixel-wise location masks -- the existing datasets either carry coarse ingredient labels or are small in size; and (2) the complex appearance of food makes it difficult to localize and recognize ingredients in food images, e.g., the ingredients may overlap one another in the same image, and the same ingredient may look different in different food images. In this work, we build a new food image dataset FoodSeg103 (and its extension FoodSeg154) containing 9,490 images. We annotate these images with 154 ingredient classes and each image has an average of 6…
Taxonomy
Topics: Advanced Chemical Sensor Technologies · Nutritional Studies and Diet · Identification and Quantification in Food
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dropout · Dense Connections · Adam · Vision Transformer · Layer Normalization · Softmax
