On Recipe Memorization and Creativity in Large Language Models: Is Your Model a Creative Cook, a Bad Cook, or Merely a Plagiator?
Jan Kvapil, Martin Fajcik

TL;DR
This paper examines how large language models generate cooking recipes, focusing on memorization versus creativity, and introduces an automated pipeline to scale that assessment.
Contribution
It presents a novel human annotation method for analyzing recipe generation and develops an automated pipeline to evaluate memorization and creativity in LLMs at scale.
Findings
Mixtral relies heavily on memorized ingredients from online sources.
The Llama 3.1+Gemma 2 9B model achieves 78% accuracy in ingredient extraction.
The automated framework enables large-scale analysis of recipe generation.
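An accuracy figure like the one above comes from comparing automated labels with human labels item by item. The following is a minimal sketch of such a comparison, not the authors' evaluation code; the function name `label_accuracy` and the label strings are assumptions made for illustration.

```python
def label_accuracy(human_labels, automated_labels):
    """Return the fraction of items where the automated label matches the human one.

    Both arguments are equal-length sequences of labels for the same items,
    in the same order. The label vocabulary used below is hypothetical.
    """
    if len(human_labels) != len(automated_labels):
        raise ValueError("label sequences must have the same length")
    if not human_labels:
        raise ValueError("cannot compute accuracy on an empty annotation set")
    matches = sum(h == a for h, a in zip(human_labels, automated_labels))
    return matches / len(human_labels)


# Toy data: four ingredients annotated by a human and by an automated judge.
human = ["memorized", "memorized", "creative", "nonsense"]
automated = ["memorized", "creative", "creative", "nonsense"]
print(label_accuracy(human, automated))  # 0.75
```

Three of the four toy labels agree, so the sketch reports 0.75. The same calculation applies unchanged to an annotation set of any size.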
Abstract
This work-in-progress investigates the memorization, creativity, and nonsense found in cooking recipes generated by Large Language Models (LLMs). Specifically, we aim (i) to analyze memorization, creativity, and nonsense in LLMs using a small, high-quality set of human judgments and (ii) to evaluate potential approaches to automating this human annotation in order to scale our study to hundreds of recipes. To achieve (i), we conduct a detailed human annotation of 20 preselected recipes generated by an LLM (Mixtral), extracting each recipe's ingredients and step-by-step actions to assess which elements are memorized (i.e., directly traceable to online sources possibly seen during training) and which arise from genuine creative synthesis or outright nonsense. We find that Mixtral consistently reuses ingredients that can be found in online documents, potentially seen during model training,…
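The traceability idea in the abstract can be illustrated with a small sketch. It assumes the online lookup has already been done and is summarized as a set of ingredient names found in web documents; the names `split_by_traceability`, `memorized_share`, and `found_online` are hypothetical, and exact string matching after lowercasing is a simplification rather than the paper's matching procedure.

```python
def split_by_traceability(ingredients, found_online):
    """Split ingredients into those traceable to online sources and the rest.

    `found_online` stands in for the result of a web lookup (an assumption);
    matching here is exact after trimming whitespace and lowercasing.
    """
    sources = {name.strip().lower() for name in found_online}
    memorized = [i for i in ingredients if i.strip().lower() in sources]
    untraced = [i for i in ingredients if i.strip().lower() not in sources]
    return memorized, untraced


def memorized_share(ingredients, found_online):
    """Return the fraction of a recipe's ingredients that are traceable online."""
    if not ingredients:
        return 0.0
    memorized, _ = split_by_traceability(ingredients, found_online)
    return len(memorized) / len(ingredients)


# Toy recipe: three ingredients appear in the (hypothetical) online set.
recipe = ["Flour", "sugar", "dragon fruit foam", "eggs"]
online = {"flour", "sugar", "eggs"}
print(split_by_traceability(recipe, online))
print(memorized_share(recipe, online))  # 0.75
```

Untraced items are not automatically creative: in the framing above, each one still needs a human or model judgment on whether it is genuine creative synthesis or nonsense.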
Taxonomy
Topics: Culinary Culture and Tourism · Sentiment Analysis and Opinion Mining · AI in Service Interactions
