Food Image Generation on Multi-Noun Categories

Xinyue Pan; Yuhao Chen; Jiangpeng He; Fengqing Zhu

arXiv:2512.09095·cs.CV·December 11, 2025

TL;DR

This paper addresses the challenge of generating accurate food images for multi-noun categories by introducing a novel method that incorporates domain knowledge and layout refinement, resulting in improved generation quality.

Contribution

The paper proposes FoCULR, a method that enhances food image generation by integrating food domain knowledge and layout refinement to better handle multi-noun categories.

Findings

01. Improved accuracy in generating multi-noun food images.
02. Enhanced spatial layout correctness in generated images.
03. Better alignment with real-world food dataset distributions.

Abstract

Generating realistic food images for categories with multiple nouns is surprisingly challenging. For instance, the prompt "egg noodle" may result in images that incorrectly contain both eggs and noodles as separate entities. Multi-noun food categories are common in real-world datasets and account for a large portion of entries in benchmarks such as UEC-256. These compound names often cause generative models to misinterpret the semantics, producing unintended ingredients or objects. Two factors are responsible: the text encoder has insufficient knowledge of multi-noun categories, and the model misinterprets the relationships between the nouns, which leads to incorrect spatial layouts. To overcome these challenges, we propose FoCULR (Food Category Understanding and Layout Refinement), which incorporates food domain knowledge and introduces core concepts early in the generation process. Experimental results demonstrate…
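The abstract's point that multi-noun names make up a large share of benchmark categories can be checked roughly on any list of category names. The sketch below is not part of the paper: it uses a multi-word name as a crude proxy for a multi-noun name (no part-of-speech tagging), and both the function name `multi_word_share` and the toy category list are hypothetical, not taken from UEC-256.

```python
def multi_word_share(category_names):
    """Return the fraction of category names with more than one word.

    Assumption: a multi-word name is used as a rough stand-in for a
    multi-noun name; no part-of-speech tagging is performed.
    """
    # Drop empty or whitespace-only entries before counting.
    names = [name.strip() for name in category_names if name.strip()]
    if not names:
        return 0.0
    multi = [name for name in names if len(name.split()) > 1]
    return len(multi) / len(names)


# Hypothetical toy list for illustration only, not the UEC-256 label set.
toy_categories = ["egg noodle", "rice", "beef curry", "pizza"]
share = multi_word_share(toy_categories)
print(f"{share:.0%} of the toy names have more than one word")
```

To apply this to a real benchmark, the toy list would be replaced with the dataset's own label file. A part-of-speech tagger would give a stricter count of names that contain multiple nouns.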

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Nutritional Studies and Diet