Multimodal Fusion of Glucose Monitoring and Food Imagery for Caloric Content Prediction
Adarsh Kumar

TL;DR
This paper presents a multimodal deep learning framework that combines glucose monitoring, demographic, microbiome, and food imagery data to improve the accuracy of caloric content prediction for dietary management in diabetes.
Contribution
The study introduces a novel multimodal fusion approach that integrates physiological, demographic, microbiome, and visual data for enhanced caloric estimation accuracy.
Findings
Achieved a root mean squared relative error (RMSRE) of 0.2544, an improvement of more than 50% over the baseline models.
Demonstrated the effectiveness of multimodal data fusion in dietary assessment.
Improved caloric prediction accuracy for personalized diabetes management.
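The RMSRE figure above is reported without a formula in this summary. As a minimal sketch, assuming the usual definition of root mean squared relative error (the square root of the mean of squared per-meal relative errors), it could be computed as follows. The function name and the calorie values are hypothetical and are not taken from the paper's data.

```python
import math

def rmsre(y_true, y_pred):
    """Root mean squared relative error: sqrt(mean(((pred - true) / true) ** 2)).

    Assumes every true value is non-zero; a zero would raise ZeroDivisionError.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_rel_errors = [((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_rel_errors) / len(squared_rel_errors))

# Hypothetical calorie values (kcal), made up purely for illustration.
true_kcal = [400.0, 650.0, 300.0]
pred_kcal = [500.0, 520.0, 330.0]
score = rmsre(true_kcal, pred_kcal)
```

Because each error is divided by the true value, a 100 kcal miss on a 400 kcal meal counts more than the same miss on a 1000 kcal meal, which is one reason a relative metric may suit meals of widely varying size.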
Abstract
Effective dietary monitoring is critical for managing Type 2 diabetes, yet accurately estimating caloric intake remains a major challenge. While continuous glucose monitors (CGMs) offer valuable physiological data, they often fall short in capturing the full nutritional profile of meals due to inter-individual and meal-specific variability. In this work, we introduce a multimodal deep learning framework that jointly leverages CGM time-series data, demographic and microbiome data, and pre-meal food images to enhance caloric estimation. Our model uses attention-based encoding and convolutional feature extraction for meal imagery and multi-layer perceptrons for CGM and microbiome data, followed by a late fusion strategy for joint reasoning. We evaluate our approach on a curated dataset of over 40 participants, incorporating synchronized CGM, demographic, and microbiome data and meal photographs…
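The late fusion strategy described in the abstract can be illustrated with a small NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the layer sizes, the random untrained weights, the use of concatenation as the fusion operator, and all names (`mlp_encode`, `attention_pool`, and so on) are hypothetical, and the image tokens stand in for the output of a convolutional feature extractor.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 4

def mlp_encode(x, w1, w2):
    """Two-layer perceptron with a ReLU in between (bias terms omitted)."""
    return np.maximum(x @ w1, 0.0) @ w2

def attention_pool(tokens, query):
    """Softmax attention pooling of image tokens against a single query vector."""
    scores = tokens @ query / np.sqrt(tokens.shape[-1])   # (batch, n_tokens)
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("bn,bnd->bd", weights, tokens)       # (batch, dim)

# Hypothetical inputs: 16 CGM readings, 8 demographic/microbiome features,
# and 9 image tokens of width 32 standing in for convolutional features.
cgm = rng.normal(size=(batch, 16))
meta = rng.normal(size=(batch, 8))
image_tokens = rng.normal(size=(batch, 9, 32))

# Each modality is encoded independently by its own (untrained) encoder.
cgm_embedding = mlp_encode(cgm, 0.1 * rng.normal(size=(16, 32)),
                           0.1 * rng.normal(size=(32, 16)))
meta_embedding = mlp_encode(meta, 0.1 * rng.normal(size=(8, 32)),
                            0.1 * rng.normal(size=(32, 16)))
image_embedding = attention_pool(image_tokens, rng.normal(size=32))

# Late fusion: combine the per-modality embeddings only at the end, then
# map the fused vector to one caloric estimate per meal.
fused = np.concatenate([cgm_embedding, meta_embedding, image_embedding], axis=-1)
prediction = fused @ (0.1 * rng.normal(size=(fused.shape[-1], 1)))
```

Keeping the encoders separate until the final step means each modality can use an architecture suited to its data, at the cost of modelling cross-modal interactions only in the fusion head.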
Taxonomy
Topics: Advanced Chemical Sensor Technologies
Methods: Softmax · Attention Is All You Need
