The Digital Sous Chef -- A Comparative Study on Fine-Tuning Language Models for Recipe Generation
Shubham Pundhir, Ganesh Bagler

TL;DR
This paper establishes a benchmark for recipe generation and shows that a fine-tuned GPT-2 large (774M) model with recipe-specific tokenization outperforms both GPT-2 small (124M) and traditional LSTM/RNN baselines.
Contribution
It introduces a tokenization strategy tailored to recipe text: the vocabulary is augmented with 23 common fraction tokens and custom structural markers, so that recipe structure and precise quantities are preserved rather than fragmented by a generic tokenizer.
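The effect of adding fraction and structural tokens can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation: the token lists below are hypothetical (the summary does not enumerate the 23 fraction tokens or name the structural markers), and a simple regex splitter stands in for GPT-2's byte-pair tokenizer. The point is only that registered tokens survive as single units while a generic tokenizer breaks them apart.

```python
import re

# Hypothetical token lists for illustration only; the paper's actual
# 23 fraction tokens and structural markers are not listed in this summary.
FRACTION_TOKENS = ["1/2", "1/3", "2/3", "1/4", "3/4", "1/8"]
STRUCTURE_TOKENS = ["<RECIPE_START>", "<INGR>", "<INSTR>", "<RECIPE_END>"]
ADDED_TOKENS = FRACTION_TOKENS + STRUCTURE_TOKENS

# Stand-in for a generic subword tokenizer: words and single punctuation marks.
GENERIC_PATTERN = re.compile(r"\w+|[^\w\s]")


def generic_tokenize(text):
    """Tokenize with no domain knowledge; fractions and markers get split."""
    return GENERIC_PATTERN.findall(text)


def build_splitter(added_tokens):
    """Compile a regex that isolates added tokens as atomic units."""
    # Longest first so a longer token is never shadowed by a shorter one.
    ordered = sorted(added_tokens, key=len, reverse=True)
    alternatives = "|".join(re.escape(tok) for tok in ordered)
    # The digit lookarounds stop "1/2" from matching inside "11/2" or "1/25".
    return re.compile(r"(?<!\d)(" + alternatives + r")(?!\d)")


SPLITTER = build_splitter(ADDED_TOKENS)


def recipe_tokenize(text):
    """Keep added tokens whole; fall back to the generic tokenizer elsewhere."""
    pieces = []
    for chunk in SPLITTER.split(text):
        if not chunk:
            continue
        if chunk in ADDED_TOKENS:
            pieces.append(chunk)  # atomic domain token
        else:
            pieces.extend(generic_tokenize(chunk))
    return pieces


if __name__ == "__main__":
    line = "<INGR> 1/2 cup sugar"
    print(generic_tokenize(line))
    print(recipe_tokenize(line))
```

With the Hugging Face Transformers library, the analogous step would typically be calling `add_tokens` on the tokenizer and then `resize_token_embeddings` on the model so the new vocabulary entries get embedding rows; whether the authors did exactly this is an assumption.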
Findings
The fine-tuned GPT-2 large (774M) model outperforms GPT-2 small (124M) and the LSTM/RNN baselines in recipe generation.
Specialized tokenization helps the model preserve recipe structure and numerical quantities.
The model achieves a relative improvement of over 20% in semantic relevance metrics.
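For readers unfamiliar with the term, a relative improvement is conventionally the score gain divided by the baseline score. The sketch below shows that arithmetic; the function name and the example scores are hypothetical and are not results from the paper, and the summary does not state which baseline the 20% figure is measured against.

```python
def relative_improvement(baseline_score, new_score):
    """Return the gain of new_score over baseline_score as a fraction of the baseline."""
    if baseline_score == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (new_score - baseline_score) / baseline_score


if __name__ == "__main__":
    # Made-up scores purely to illustrate the calculation (not from the paper).
    gain = relative_improvement(0.50, 0.61)
    print(f"{gain:.0%}")
```

Under this definition, a move from 0.50 to 0.61 is an absolute gain of 0.11 but a relative gain of about 22%.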
Abstract
We establish a rigorous benchmark for text-based recipe generation, a fundamental task in natural language generation. We present a comprehensive comparative study contrasting a fine-tuned GPT-2 large (774M) model against the GPT-2 small (124M) model and traditional LSTM/RNN baselines on the 5-cuisine corpus from RecipeDB. Our key contribution is a targeted tokenization strategy that augments the vocabulary with 23 common fraction tokens and custom structural markers. This approach addresses a critical limitation of generic tokenizers by preserving essential recipe structures and precise numerical quantities, thereby enhancing domain specificity. Performance is evaluated using a comprehensive suite of seven automatic metrics spanning fluency (BLEU-4, METEOR), coherence (ROUGE-L), semantic relevance (BERTScore), and diversity. Our experiments show that the large transformer-based…
Taxonomy
Topics: Digital Humanities and Scholarship · Digital Games and Media
