CaLoRAify: Calorie Estimation with Visual-Text Pairing and LoRA-Driven Visual Language Models

Dongyu Yao; Keling Yao; Junhong Zhou; Yinghao Zhang

arXiv:2412.09936·cs.CV·December 16, 2024

TL;DR

CaLoRAify is a vision-language model framework that uses a large curated dataset and LoRA techniques to accurately estimate calories from food images while supporting conversational interactions.

Contribution

The paper introduces CaLoRAify, a VLM framework that combines LoRA and retrieval-augmented generation (RAG), together with CalData, a new 330K image-text pair dataset, to improve calorie estimation from food images.
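For readers unfamiliar with LoRA, the idea can be sketched in a few lines. This is the generic low-rank adaptation formulation, not the authors' implementation: the frozen weight `W` is left untouched, and a trainable update `(alpha / r) * B @ A` of rank `r` is added on top. The names `matvec`, `lora_forward`, the dimensions, and the scaling value are all illustrative assumptions, and pure-Python lists stand in for tensors.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(x, W, A, B, alpha):
    """Return W x + (alpha / r) * B (A x).

    W: frozen weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)
    base = matvec(W, x)               # frozen path
    update = matvec(B, matvec(A, x))  # low-rank path, never forms B @ A
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d_in, d_out, r = 8, 8, 2  # hypothetical sizes for illustration

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: the adapter starts as a no-op

x = [float(i + 1) for i in range(d_in)]
y0 = lora_forward(x, W, A, B, alpha=4.0)

# Only A and B would be trained; W stays frozen.
trainable = r * (d_in + d_out)
full = d_in * d_out
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and fine-tuning only has to learn `r * (d_in + d_out)` parameters instead of `d_in * d_out`. The saving grows with layer size; at the toy sizes above it is 32 versus 64.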

Findings

1. Achieved accurate calorie estimation from monocular food images.

2. Enhanced VLM performance in calorie estimation with LoRA and RAG.

3. Open-sourced code and dataset for community use.

Abstract

The obesity phenomenon, known as the heavy issue, is a leading cause of preventable chronic diseases worldwide. Traditional calorie estimation tools often rely on specific data formats or complex pipelines, limiting their practicality in real-world scenarios. Recently, vision-language models (VLMs) have excelled in understanding real-world contexts and enabling conversational interactions, making them ideal for downstream tasks such as ingredient analysis. However, applying VLMs to calorie estimation requires domain-specific data and alignment strategies. To this end, we curated CalData, a 330K image-text pair dataset tailored for ingredient recognition and calorie estimation, combining a large-scale recipe dataset with detailed nutritional instructions for robust vision-language training. Built upon this dataset, we present CaLoRAify, a novel VLM framework aligning ingredient…

Code & Models

Repositories

kennyyao2001/16824-caloraify
PyTorch · Official
