Data Augmentation to Improve Large Language Models in Food Hazard and Product Detection

Areeg Fahad Rasheed; M. Zarkoosh; Shimam Amer Chasib; Safa F. Abbas

arXiv:2502.08687 · cs.CL · February 14, 2025

TL;DR

This study shows that data augmentation with ChatGPT-4o-mini enhances large language models' performance in food hazard and product detection tasks, improving key metrics over using original data alone.

Contribution

It introduces a novel data augmentation approach using ChatGPT-4o-mini to improve LLMs in food hazard detection, demonstrating measurable performance gains.

Findings

01

Improved recall, F1 score, precision, and accuracy with augmented data.

02

Models trained with augmented data outperform models trained on the original data alone.

03

Code and datasets are publicly available for replication.
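
The four metrics named in finding 01 can be computed from true and predicted labels alone. The following is a minimal sketch, not the authors' evaluation code: the function name `classification_metrics`, the macro averaging over labels, and the example labels are all assumptions made for illustration, since the page does not state which averaging scheme was used.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) gives every label equal weight,
    regardless of how many examples it has.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Illustrative labels only; they are not taken from the paper's dataset.
y_true = ["biological", "biological", "allergens", "chemical"]
y_pred = ["biological", "allergens", "allergens", "chemical"]
metrics = classification_metrics(y_true, y_pred)
```

Comparing the dictionary returned for a model trained on the original data against the one for a model trained on augmented data, on the same test set, is one way to check the kind of improvement the findings describe.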

Abstract

The primary objective of this study is to demonstrate the impact of data augmentation using ChatGPT-4o-mini on food hazard and product analysis. The augmented data is generated using ChatGPT-4o-mini and subsequently used to train two large language models: RoBERTa-base and Flan-T5-base. The models are evaluated on test sets. The results indicate that using augmented data helped improve model performance across key metrics, including recall, F1 score, precision, and accuracy, compared to using only the provided dataset. The full code, including model training and the augmented dataset, can be found in this repository: https://github.com/AREEG94FAHAD/food-hazard-prdouct-cls
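
The abstract does not describe the prompt or the selection of examples used for augmentation. As a hedged illustration of the general idea, the sketch below tops up under-represented labels with paraphrased copies of existing examples. Everything in it is hypothetical: `augment_dataset`, `min_per_label`, the round-robin selection, and `toy_paraphrase`, which stands in for a call to a generative model such as ChatGPT-4o-mini so that the example runs offline.

```python
from collections import defaultdict
from itertools import cycle


def augment_dataset(examples, paraphrase, min_per_label):
    """Return the original (text, label) pairs plus synthetic ones.

    For every label with fewer than `min_per_label` examples, existing
    texts are paraphrased in round-robin order until the minimum is met.
    `paraphrase(text, variant)` is a caller-supplied function; in practice
    it would wrap a request to a language model (assumption).
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    augmented = list(examples)  # originals are kept unchanged
    for label, texts in by_label.items():
        missing = max(min_per_label - len(texts), 0)
        source = cycle(texts)
        for variant in range(missing):
            augmented.append((paraphrase(next(source), variant), label))
    return augmented


def toy_paraphrase(text, variant):
    """Offline stand-in for a model-generated paraphrase (hypothetical)."""
    return f"{text} (variant {variant + 1})"


# Illustrative records only; they are not taken from the paper's dataset.
data = [
    ("Salmonella found in chicken", "biological"),
    ("Listeria detected in cheese", "biological"),
    ("Undeclared milk in cookies", "allergens"),
]
balanced = augment_dataset(data, toy_paraphrase, min_per_label=2)
```

Passing the paraphraser in as a function keeps the balancing logic testable without network access; the augmented list could then be used to fine-tune a classifier such as RoBERTa-base or Flan-T5-base, as the abstract describes.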

Code & Models

Repositories

areeg94fahad/food-hazard-prdouct-cls
Official

Taxonomy

Topics: Food Safety and Hygiene · Food Supply Chain Traceability