Machine learning and natural language processing models to predict the extent of food processing

Nalin Arora; Sumit Bhagat; Riya Dhama; Ganesh Bagler

arXiv:2412.17217 · q-bio.BM · June 24, 2025

TL;DR

This study develops machine learning and NLP models to accurately predict the level of food processing using nutrient profiles, aiding public health efforts to identify ultra-processed foods.

Contribution

It introduces integrated ML, deep learning, and NLP models that utilize nutrient data to classify food processing levels, including a user-friendly web server for practical application.

Findings

01 Best models achieved F1-scores above 0.93.

02 Nutrient panels of 13 to 102 features yield high prediction accuracy.

03 NLP models demonstrated state-of-the-art performance.

Abstract

The dramatic increase in the consumption of ultra-processed food has been associated with numerous adverse health effects. Given these public health consequences, it is highly relevant to build computational models that predict the extent of processing of food products. We created a range of machine learning, deep learning, and NLP models to predict the extent of food processing by integrating the FNDDS dataset of food products and their nutrient profiles with their reported NOVA processing levels. Starting with the full nutritional panel of 102 features, we coarse-grained the features to 65 nutrients by dropping flavonoids and then to 13 by considering the FDA's 13-nutrient panel. LGBM Classifier and Random Forest emerged as the best models for 102 and 65 nutrients, respectively, with F1-scores of 0.9411 and 0.9345 and MCC of…
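The abstract reports model quality as an F1-score and a Matthews correlation coefficient (MCC) over NOVA processing levels. As a minimal sketch of how these two metrics can be computed for a multiclass problem, the Python below uses only the standard library. The excerpt does not state which F1 averaging the authors used, so macro-averaging is an assumption here, and the labels in the example are toy values invented for illustration, not data from the paper.

```python
from collections import Counter
from math import sqrt


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for k in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != k and p == k)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == k and p != k)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multiclass MCC (Gorodkin's R_K generalisation of the binary MCC)."""
    s = len(y_true)                                       # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                            # true count per class
    p_counts = Counter(y_pred)                            # predicted count per class
    labels = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy example: NOVA levels 1-4, with one level-2 item misclassified as level 3.
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 3, 3, 3, 4, 4]
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
print(f"MCC      = {multiclass_mcc(y_true, y_pred):.4f}")
```

MCC uses the full confusion matrix rather than per-class precision and recall alone, which is why it is often reported alongside F1 when class sizes are uneven.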

Code & Models

Repositories

cosylabiiit/nova_food_processing
Official

Taxonomy

Topics: Food Industry and Aquatic Biology