FoodMem: Near Real-time and Precise Food Video Segmentation
Ahmad AlMughrabi, Adrián Galán, Ricardo Marques, Petia Radeva

TL;DR
FoodMem is a novel framework for near-real-time, precise food video segmentation and tracking. It significantly improves accuracy and speed over existing models while using minimal hardware resources.
Contribution
The paper introduces FoodMem, a two-phase transformer and memory-based framework for high-quality food segmentation in videos, outperforming current models in accuracy and speed.
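The two-phase idea can be pictured with a toy sketch: an initial segmentation step produces a mask, and a small memory of recent masks then stabilizes the masks of later frames. This is not the authors' implementation. The function names, the intensity threshold standing in for a transformer segmenter, and the majority vote standing in for a memory-based tracker are all assumptions made for illustration.

```python
from collections import deque


def segment_frame(frame, threshold=0.5):
    """Hypothetical stand-in for a segmenter: threshold pixel scores into a binary mask."""
    return [[1 if px >= threshold else 0 for px in row] for row in frame]


def track_with_memory(frames, memory_size=3, threshold=0.5):
    """Toy two-phase loop: segment the first frame, then smooth later masks with a memory."""
    # Phase 1 (assumed): obtain an initial mask from the first frame.
    memory = deque(maxlen=memory_size)
    memory.append(segment_frame(frames[0], threshold))
    outputs = [memory[0]]

    # Phase 2 (assumed): combine each new raw mask with the remembered masks.
    for frame in frames[1:]:
        raw = segment_frame(frame, threshold)
        memory.append(raw)  # the oldest mask is dropped once the memory is full
        height, width = len(raw), len(raw[0])
        # A pixel is kept as food only if a strict majority of remembered masks agree.
        voted = [
            [1 if 2 * sum(m[y][x] for m in memory) > len(memory) else 0
             for x in range(width)]
            for y in range(height)
        ]
        outputs.append(voted)
    return outputs


# One-pixel video whose score briefly drops in the third frame.
flickering_video = [[[0.9]], [[0.9]], [[0.1]], [[0.9]]]
masks = track_with_memory(flickering_video)
```

In this toy, the brief drop in the third frame is outvoted by the two remembered masks, so the output mask stays stable. A real memory-based tracker uses learned features rather than a vote, but the role of the memory is similar.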
Findings
FoodMem improves segmentation accuracy by 2.5% mAP over existing models.
FoodMem is 58 times faster than existing models.
The framework performs well across diverse challenging scenarios.
Abstract
Food segmentation, including in videos, is vital for addressing real-world health, agriculture, and food biotechnology issues. Current limitations lead to inaccurate nutritional analysis, inefficient crop management, and suboptimal food processing, impacting food security and public health. Improving segmentation techniques can enhance dietary assessments, agricultural productivity, and the food production process. This study introduces the development of a robust framework for high-quality, near-real-time segmentation and tracking of food items in videos, using minimal hardware resources. We present FoodMem, a novel framework designed to segment food items from video sequences of 360-degree unbounded scenes. FoodMem can consistently generate masks of food portions in a video sequence, overcoming the limitations of existing semantic segmentation models, such as flickering and…
Taxonomy
Topics: Culinary Culture and Tourism
