CookingDiffusion: Cooking Procedural Image Generation with Stable Diffusion

Yuan Wang; Bin Zhu; Yanbin Hao; Chong-Wah Ngo; Yi Tan; Xiang Wang

arXiv:2501.09042 · cs.CV · February 11, 2025

Open Access

TL;DR

CookingDiffusion introduces a novel method for generating sequential, photo-realistic cooking images aligned with recipe steps, enhancing visual cooking guidance and simulation capabilities.

Contribution

The paper presents CookingDiffusion, an approach that combines Stable Diffusion with three Memory Nets to generate consistent, high-quality procedural images from recipes and their cooking steps.

Findings

01  Generates high-quality, sequential cooking images and evaluates them with consistency metrics

02  Outperforms baselines on FID and procedure consistency

03  Demonstrates the ability to manipulate ingredients and cooking methods

Abstract

Recent advancements in text-to-image generation models have excelled in creating diverse and realistic images. This success extends to food imagery, where various conditional inputs like cooking styles, ingredients, and recipes are utilized. However, a yet-unexplored challenge is generating a sequence of procedural images based on cooking steps from a recipe. This could enhance the cooking experience with visual guidance and possibly lead to an intelligent cooking simulation system. To fill this gap, we introduce a novel task called cooking procedural image generation. This task is inherently demanding, as it strives to create photo-realistic images that align with cooking steps while preserving sequential consistency. To collectively tackle these challenges, we present CookingDiffusion, a novel approach that leverages Stable Diffusion and three innovative Memory Nets…

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Diffusion · ALIGN