Multi-Task Learning for Calorie Prediction on a Novel Large-Scale Recipe Dataset Enriched with Nutritional Information
Robin Ruede, Verena Heusser, Lukas Frank, Alina Roitberg, Monica Haurilet, Rainer Stiefelhagen

TL;DR
This paper introduces a large-scale dataset and a multi-task learning approach to estimate meal calories from images by leveraging recipe data and nutritional information, outperforming single-task models.
Contribution
It presents the pic2kcal dataset and demonstrates the effectiveness of multi-task learning for calorie prediction from food images.
Findings
Multi-task learning outperforms single-task calorie regression by 9.9%.
The dataset contains 308,000 images from over 70,000 recipes.
Models predict calories, nutrients, and ingredients simultaneously.
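As a rough illustration of what predicting calories, nutrients, and ingredients simultaneously can look like at the loss level, the sketch below sums one loss term per task. The choice of L1 loss for the regression targets, binary cross-entropy for ingredients, the dictionary keys, and the equal task weights are all assumptions made for this example and are not taken from the paper.

```python
import math

def l1_loss(pred, target):
    """Mean absolute error over paired values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def bce_loss(probs, labels, eps=1e-7):
    """Mean binary cross-entropy for multi-label ingredient predictions."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def multi_task_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of calorie, nutrient, and ingredient losses.

    `pred` and `target` are hypothetical dicts with keys
    "kcal" (float), "nutrients" (list of floats), and
    "ingredients" (probabilities in pred, 0/1 labels in target).
    """
    w_kcal, w_nutr, w_ingr = weights
    return (
        w_kcal * l1_loss([pred["kcal"]], [target["kcal"]])
        + w_nutr * l1_loss(pred["nutrients"], target["nutrients"])
        + w_ingr * bce_loss(pred["ingredients"], target["ingredients"])
    )
```

Setting `weights=(1.0, 0.0, 0.0)` reduces this to a single-task calorie loss, which is the kind of baseline the multi-task result is compared against. In practice the regression targets would likely be normalized so that no single task dominates the sum.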
Abstract
A rapidly growing amount of content posted online, such as food recipes, opens doors to new exciting applications at the intersection of vision and language. In this work, we aim to estimate the calorie amount of a meal directly from an image by learning from recipes people have published on the Internet, thus skipping time-consuming manual data annotation. Since there are few large-scale publicly available datasets captured in unconstrained environments, we propose the pic2kcal benchmark comprising 308,000 images from over 70,000 recipes including photographs, ingredients and instructions. To obtain nutritional information of the ingredients and automatically determine the ground-truth calorie value, we match the items in the recipes with structured information from a food item database. We evaluate various neural networks for regression of the calorie quantity and extend them with…
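The ingredient-matching step described in the abstract can be pictured with a minimal sketch: each recipe ingredient is looked up in a food item table, and the matched amounts are summed into a recipe-level calorie value. The table `FOOD_DB`, its values, the exact-name lookup, and the `recipe_kcal` helper are hypothetical simplifications for illustration; the matching procedure used by the authors is not specified in this excerpt.

```python
# Hypothetical food item table: kcal per 100 g (illustrative values only).
FOOD_DB = {"flour": 364.0, "butter": 717.0, "sugar": 387.0}

def recipe_kcal(ingredients, food_db=FOOD_DB):
    """Sum calories over ingredients that match the food table.

    `ingredients` is a list of (name, grams) pairs. Returns the total
    kcal of matched items and the fraction of ingredients matched.
    """
    total = 0.0
    matched = 0
    for name, grams in ingredients:
        kcal_per_100g = food_db.get(name.strip().lower())
        if kcal_per_100g is None:
            continue  # unmatched ingredient contributes nothing
        total += grams * kcal_per_100g / 100.0
        matched += 1
    coverage = matched / len(ingredients) if ingredients else 0.0
    return total, coverage
```

Returning the match coverage alongside the total makes it possible to discard recipes where too few ingredients were matched, since their calorie labels would be unreliable.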
