VolETA: One- and Few-shot Food Volume Estimation
Ahmad AlMughrabi, Umair Haroon, Ricardo Marques, Petia Radeva

TL;DR
VolETA is a 3D generative approach that estimates food volume from one or a few RGBD images, with the goal of making dietary assessment tools more precise and robust.
Contribution
This paper presents a new methodology combining 3D mesh reconstruction, scaling, and deep learning models for one- and few-shot food volume estimation, addressing occlusions and complex geometries.
Findings
Achieves a mean absolute percentage error (MAPE) of 10.97% on the MTF dataset
Effectively handles occlusions and lighting variations
Provides robust volume estimates for complex foods
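For readers unfamiliar with the metric, MAPE averages the absolute volume error of each item relative to its ground-truth volume, expressed as a percentage. The sketch below uses the standard definition; the function name `mape` and the sample volumes are illustrative assumptions, not values or code from the paper.

```python
def mape(true_volumes, predicted_volumes):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(true_volumes) != len(predicted_volumes) or not true_volumes:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative absolute error per item; ground-truth volumes must be non-zero.
    relative_errors = [
        abs(pred - true) / abs(true)
        for true, pred in zip(true_volumes, predicted_volumes)
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Made-up example volumes (e.g. in cm^3), not data from the MTF dataset.
example_true = [100.0, 200.0]
example_pred = [110.0, 180.0]
example_score = mape(example_true, example_pred)
```

Both example predictions are off by 10% of their ground truth, so the score is about 10. A lower score means the estimated volumes are closer to the measured ones.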
Abstract
Accurate food volume estimation is essential for dietary assessment, nutritional tracking, and portion control applications. We present VolETA, a sophisticated methodology for estimating food volume using 3D generative techniques. Our approach creates a scaled 3D mesh of food objects using one- or few-RGBD images. We start by selecting keyframes based on the RGB images and then segmenting the reference object in the RGB images using XMem++. Simultaneously, camera positions are estimated and refined using the PixSfM technique. The segmented food images, reference objects, and camera poses are combined to form a data model suitable for NeuS2. Independent mesh reconstructions for reference and food objects are carried out, with scaling factors determined using MeshLab based on the reference object. Moreover, depth information is used to fine-tune the scaling factors by estimating the…
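The role of the scaling factor can be illustrated with a small sketch. A mesh reconstructed from images has arbitrary units, so a reference object of known size is needed to convert it to real-world units; because volume grows with the cube of a linear scale, the metric volume is the unscaled mesh volume multiplied by the scale factor cubed. The code below is an assumption-laden illustration of that idea, not the authors' pipeline (which uses MeshLab and depth-based fine-tuning): the function names, the closed-triangle-mesh input, and the sample numbers are all hypothetical.

```python
def mesh_volume(vertices, faces):
    """Volume of a closed triangle mesh via summed signed tetrahedra."""
    total = 0.0
    for i, j, k in faces:
        (ax, ay, az), (bx, by, bz), (cx, cy, cz) = vertices[i], vertices[j], vertices[k]
        # Scalar triple product a . (b x c) = 6 * signed volume of tetra (origin, a, b, c).
        total += (
            ax * (by * cz - bz * cy)
            + ay * (bz * cx - bx * cz)
            + az * (bx * cy - by * cx)
        )
    # abs() makes the result independent of a consistent inward/outward winding.
    return abs(total) / 6.0


def scale_factor(real_reference_size, mesh_reference_size):
    """Linear scale from mesh units to real units, from one known reference length."""
    if mesh_reference_size <= 0:
        raise ValueError("mesh_reference_size must be positive")
    return real_reference_size / mesh_reference_size


def scaled_volume(unscaled_volume, scale):
    """Volume scales with the cube of the linear scale factor."""
    return unscaled_volume * scale ** 3


# Hypothetical stand-in for a reconstructed food mesh: a unit right tetrahedron.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

raw_volume = mesh_volume(tetra_vertices, tetra_faces)  # volume in mesh units^3
# Assumed numbers: reference object is 2.0 cm wide but measures 0.5 mesh units.
s = scale_factor(real_reference_size=2.0, mesh_reference_size=0.5)
food_volume_cm3 = scaled_volume(raw_volume, s)
```

Here the unit tetrahedron has volume 1/6 in mesh units, and a linear scale of 4 multiplies that by 64. The cubic dependence is why small errors in the scaling factor matter: a 5% error in the linear scale changes the volume by roughly 16%.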
Taxonomy
Topics
Nutritional Studies and Diet
