An Improved Encoder-Decoder Framework for Food Energy Estimation
Jack Ma, Jiangpeng He, Fengqing Zhu

TL;DR
This paper presents an improved encoder-decoder framework for estimating food energy from a single image. Built on a dietitian-verified food image dataset, it improves MAPE by over 10% compared with previous methods.
Contribution
The authors introduce a novel encoder-decoder approach and a verified food image dataset for more accurate caloric estimation from images.
Findings
Over 10% improvement in MAPE over previous methods
Reduction of 30 kcal in MAE compared to prior work
High-quality dataset verified by dietitians
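The two metrics named in the findings can be illustrated with a short sketch. It uses the standard textbook definitions of mean absolute error (MAE, in kcal) and mean absolute percentage error (MAPE, in percent); the summary does not state the paper's exact formulas, so treat these as an assumption. The function names and the calorie values are made up for illustration and are not taken from the paper.

```python
def mean_absolute_error(true_kcal, pred_kcal):
    """Average absolute difference between prediction and ground truth, in kcal."""
    if len(true_kcal) != len(pred_kcal) or not true_kcal:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_kcal, pred_kcal)) / len(true_kcal)


def mean_absolute_percentage_error(true_kcal, pred_kcal):
    """Average absolute error relative to the ground truth, in percent.

    Assumes every ground-truth value is non-zero.
    """
    if len(true_kcal) != len(pred_kcal) or not true_kcal:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(t - p) / abs(t) for t, p in zip(true_kcal, pred_kcal)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical ground-truth and predicted energy values for three images (kcal).
true_kcal = [200.0, 500.0, 400.0]
pred_kcal = [180.0, 550.0, 400.0]

mae = mean_absolute_error(true_kcal, pred_kcal)
mape = mean_absolute_percentage_error(true_kcal, pred_kcal)
print(f"MAE:  {mae:.2f} kcal")
print(f"MAPE: {mape:.2f} %")
```

MAE is reported in the same unit as the target (kcal), so it is dominated by high-energy meals, while MAPE normalizes each error by the true value and weights small and large meals equally. That is one reason to report both.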
Abstract
Dietary assessment is essential to maintaining a healthy lifestyle. Automatic image-based dietary assessment is a growing field of research due to the increasing prevalence of image-capturing devices (e.g., mobile phones). In this work, we estimate food energy from a single monocular image, a difficult task because an image carries only a limited amount of energy information and that information is hard to extract. To do so, we employ an improved encoder-decoder framework for energy estimation: the encoder transforms the image into a representation in which food energy information is easier to extract, and the decoder then extracts that information. To implement our method, we compile a high-quality food image dataset, verified by registered dietitians, containing eating scene images, food-item segmentation masks, and ground-truth calorie values. Our method improves upon previous…
Taxonomy
Topics: Nutritional Studies and Diet · Diet and metabolism studies · Nutrition, Genetics, and Disease
Methods: Masked autoencoder
