Retrieval Augmented Recipe Generation

Guoshan Liu; Hailong Yin; Bin Zhu; Jingjing Chen; Chong-Wah Ngo; Yu-Gang Jiang

arXiv:2411.08715·cs.CV·December 12, 2024

TL;DR

This paper introduces a retrieval-augmented large multimodal model with stochastic retrieval and self-consistency voting to improve recipe generation from food images, achieving state-of-the-art results on Recipe1M.

Contribution

The paper proposes a retrieval-augmented approach that combines Stochastic Diversified Retrieval Augmentation (SDRA) with self-consistency voting to improve the accuracy and diversity of generated recipes.
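As a rough illustration of what self-consistency voting can look like for free-form text, the sketch below picks, from several candidate recipes, the one that agrees most with the others. This is a minimal sketch under stated assumptions, not the paper's implementation: the token-overlap (Jaccard) similarity, the function names `jaccard` and `vote`, and the sample candidates are all hypothetical choices made for this example.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two texts (assumed stand-in similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def vote(candidates: List[str]) -> str:
    """Return the candidate with the highest mean similarity to the others."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def agreement(i: int) -> float:
        # Mean similarity of candidate i to every other candidate.
        scores = [
            jaccard(candidates[i], other)
            for j, other in enumerate(candidates)
            if j != i
        ]
        return sum(scores) / len(scores)

    best = max(range(len(candidates)), key=agreement)
    return candidates[best]


# Hypothetical candidates, e.g. generated with different retrieved contexts.
candidates = [
    "mix flour and eggs then bake",
    "mix flour eggs and sugar then bake",
    "mix flour and eggs then fry",
]
chosen = vote(candidates)
```

The design choice here is that agreement among independently generated candidates serves as a proxy for reliability; any other text similarity measure could be swapped in for `jaccard`.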

Findings

1. Achieves state-of-the-art performance on the Recipe1M dataset.

2. Effectively reduces hallucinations in recipe generation.

3. Enhances the diversity and relevance of generated recipes.

Abstract

Given the potential applications of generating recipes from food images, this area has garnered significant attention from researchers in recent years. Existing works for recipe generation primarily utilize a two-stage training method, first generating ingredients and then obtaining instructions from both the image and ingredients. Large Multi-modal Models (LMMs), which have achieved notable success across a variety of vision and language tasks, shed light on generating both ingredients and instructions directly from images. Nevertheless, LMMs still face the common issue of hallucinations during recipe generation, leading to suboptimal performance. To tackle this, we propose a retrieval augmented large multimodal model for recipe generation. We first introduce Stochastic Diversified Retrieval Augmentation (SDRA) to retrieve recipes semantically related to the image from an existing…

Taxonomy

Topics: Artificial Intelligence in Games

Methods: Softmax · Attention Is All You Need