OmniFood8K: Single-Image Nutrition Estimation via Hierarchical Frequency-Aligned Fusion

Dongjian Yu; Weiqing Min; Qian Jiang; Xing Lin; Xin Jin; Shuqiang Jiang

arXiv:2604.12356 · cs.CV · April 15, 2026

TL;DR

This paper introduces OmniFood8K, a multimodal dataset of 8,036 Chinese food samples with nutritional annotations and multi-view images, and proposes an end-to-end RGB-to-nutrition framework that combines depth prediction with frequency-domain feature fusion.

Contribution

The paper contributes a comprehensive dataset for Chinese cuisine and an end-to-end RGB-based nutrition prediction model built on frequency-aligned feature fusion.

Findings

01. The proposed method outperforms existing approaches on multiple datasets.

02. The hierarchical frequency-aligned fusion improves feature representation.

03. The synthetic dataset, NutritionSynth-115K, improves model robustness and generalization.

Abstract

Accurate estimation of food nutrition plays a vital role in promoting healthy dietary habits and personalized diet management. Most existing food datasets primarily focus on Western cuisines and lack sufficient coverage of Chinese dishes, which restricts accurate nutritional estimation for Chinese meals. Moreover, many state-of-the-art nutrition prediction methods rely on depth sensors, restricting their applicability in daily scenarios. To address these limitations, we introduce OmniFood8K, a comprehensive multimodal dataset comprising 8,036 food samples, each with detailed nutritional annotations and multi-view images. In addition, to enhance models' capability in nutritional prediction, we construct NutritionSynth-115K, a large-scale synthetic dataset that introduces compositional variations while preserving precise nutritional labels. Moreover, we propose an end-to-end framework for…

Code & Models

Repositories

https://yudongjian.github.io/OmniFood8K-food
