Application of machine learning to predict food processing level using Open Food Facts
Nalin Arora, Aviral Chauhan, Siddhant Rana, Mahansh Aditya, Sumit Bhagat, Aditya Kumar, Akash Kumar, Akanksh Semar, Ayush Vikram Singh, Ganesh Bagler

TL;DR
This study applies machine learning models to classify food processing levels using a large dataset, revealing health, environmental, and allergenic implications of ultra-processed foods and providing a web tool for NOVA prediction.
Contribution
It introduces the first large-scale machine learning approach to classify food processing levels based on nutrient data from Open Food Facts.
Findings
LightGBM performed best among the models tested, achieving 80-85% classification accuracy across different nutrient panels.
Higher NOVA classes are associated with lower Nutri-Scores, and NOVA 3 and 4 products with higher carbon footprints and lower Eco-Scores.
Ultra-processed foods often contain common allergens like gluten and milk.
Abstract
Ultra-processed foods are increasingly linked to health issues like obesity, cardiovascular disease, type 2 diabetes, and mental health disorders due to poor nutritional quality. This first-of-its-kind study at such a scale uses machine learning to classify food processing levels (NOVA) based on the Open Food Facts dataset of over 900,000 products. Models including LightGBM, Random Forest, and CatBoost were trained on nutrient concentration data. LightGBM performed best, achieving 80-85% accuracy across different nutrient panels and effectively distinguishing minimally from ultra-processed foods. Exploratory analysis revealed strong associations between higher NOVA classes and lower Nutri-Scores, indicating poorer nutritional quality. Products in NOVA 3 and 4 also had higher carbon footprints and lower Eco-Scores, suggesting greater environmental impact. Allergen analysis identified…
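To make the idea of predicting a NOVA class from nutrient concentrations concrete, here is a minimal sketch. It is not the authors' pipeline: it swaps in a nearest-centroid classifier in place of LightGBM so that it runs with the Python standard library alone. The three features (sugars, fat, salt), the toy rows in `TRAIN`, and the helper names `fit_centroids`, `predict`, and `accuracy` are all assumptions made for illustration, and the numbers are invented rather than taken from Open Food Facts.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy rows: (sugars, fat, salt) in g per 100 g, with a NOVA label.
# These values are invented for illustration and are NOT Open Food Facts data.
TRAIN = [
    ((4.0, 0.5, 0.01), 1),
    ((5.0, 1.0, 0.02), 1),
    ((10.0, 20.0, 1.2), 3),
    ((12.0, 22.0, 1.5), 3),
    ((45.0, 25.0, 0.8), 4),
    ((50.0, 30.0, 1.0), 4),
]


def fit_centroids(rows):
    """Average the nutrient vectors of each NOVA class."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the NOVA class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted class matches the given label."""
    hits = sum(predict(centroids, f) == y for f, y in rows)
    return hits / len(rows)


centroids = fit_centroids(TRAIN)
```

Calling `predict(centroids, (48.0, 28.0, 0.9))` assigns a new nutrient vector to the closest class centroid, and `accuracy` reports the share of correct labels, the same kind of metric as the 80-85% figure quoted above. A gradient-boosted model such as LightGBM would replace `fit_centroids` and `predict` while leaving the evaluation step unchanged.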
Taxonomy
Topics: Consumer Attitudes and Food Labeling · Nutritional Studies and Diet · Nutrition, Genetics, and Disease
