# A Methodological Framework for Aggregating Branded Food Composition Data in mHealth Nutrition Databases: A Case Presentation

**Authors:** Antonis Vlassopoulos, Stefania Xanthopoulou, Sofia Eleftheriou, Ioannis Koutsias, Maria C. Giannakourou, Anastasia Kanellou, Maria Kapsokefalou

PMC · DOI: 10.3390/nu18020359 · 2026-01-22

## TL;DR

This paper presents a framework for aggregating branded food data to update nutrition databases used in mHealth apps.

## Contribution

The paper introduces a three-step framework for aggregating branded food composition data into generic food names.

## Key findings

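- 347 new generic food names were proposed, each derived from at least three branded products; 235 of them were populated with aggregated nutritional values.
- Most aggregated energy, protein, carbohydrate and sodium values (95.3%, 88.6%, 86% and 82.6%, respectively) had a coefficient of variation below 40%; saturated fatty acid and total sugar values met this level less often (76.3% and 65.3%).
- Heterogeneity was concentrated mainly in baked goods, milk products and milk imitation products.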

## Abstract

Background/Objectives: Up-to-date, relevant and detailed food composition databases (FCDs) are a central component of mHealth apps. Thus, expanding and/or updating such FCDs through the aggregation of data from branded food composition databases (BFCDs) could prove to be a cost-efficient methodology. However, a framework for data aggregation from BFCDs has yet to be documented. Methods: Products (n = 3988) available in the HelTH BFCD were grouped following a three-step process. First, foods were grouped based on their name. Second, the aggregated nutritional composition was tested for heterogeneity using a coefficient of variation cut-off of 20%. Third, the ingredient list and other product characteristics were searched to identify descriptors that reduced heterogeneity. Results: Following the three-step process, n = 347 new generic food names were proposed, each derived from at least three branded products, of which n = 235 were populated with aggregated nutritional content values. We found that 95.3%, 88.6%, 86% and 82.6% of aggregated energy, protein, carbohydrate and sodium values, respectively, had a coefficient of variation <40%. Aggregated saturated fatty acid and total sugar values were less likely to fall within this homogeneity level (76.3% and 65.3%, respectively). The heterogeneity was concentrated primarily in specific subcategories such as baked goods, milk products and milk imitation products. Conclusions: BFCDs can be used as a resource to expand existing databases with relatively homogeneous and up-to-date nutritional composition data. Applying this framework to larger datasets could improve the generic food name yield and homogeneity and support mHealth apps and other uses.
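
The three-step grouping described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field names (`name`, `fat_level`, `energy_kcal`), the function names, the use of the sample standard deviation, and treating values at exactly the cut-off as homogeneous are all assumptions, and the records are invented for illustration rather than taken from the HelTH BFCD. Only the minimum of three products per generic name and the 20% coefficient of variation cut-off come from the abstract.

```python
from collections import defaultdict
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent (sample SD / mean * 100), or None if undefined."""
    if len(values) < 2:
        return None
    m = mean(values)
    if m == 0:
        return None
    return stdev(values) / m * 100


def aggregate(products, nutrient, key=("name",), min_products=3, cv_cutoff=20.0):
    """Group products by the given key fields and summarise one nutrient.

    Groups with fewer than `min_products` members are skipped. A group is
    flagged homogeneous when its CV is at or below `cv_cutoff` (assumption:
    the abstract does not say whether the cut-off is inclusive).
    """
    groups = defaultdict(list)
    for product in products:
        group_key = tuple(product[field] for field in key)
        groups[group_key].append(product[nutrient])

    summary = {}
    for group_key, values in groups.items():
        if len(values) < min_products:
            continue  # too few branded products to propose a generic name
        cv = coefficient_of_variation(values)
        summary[group_key] = {
            "n": len(values),
            "mean": mean(values),
            "cv_percent": cv,
            "homogeneous": cv is not None and cv <= cv_cutoff,
        }
    return summary


# Invented illustrative records (kcal per 100 g); not data from the paper.
products = [
    {"name": "yogurt", "fat_level": "full fat", "energy_kcal": 95},
    {"name": "yogurt", "fat_level": "full fat", "energy_kcal": 98},
    {"name": "yogurt", "fat_level": "full fat", "energy_kcal": 101},
    {"name": "yogurt", "fat_level": "low fat", "energy_kcal": 55},
    {"name": "yogurt", "fat_level": "low fat", "energy_kcal": 58},
    {"name": "yogurt", "fat_level": "low fat", "energy_kcal": 60},
]

# Steps 1 and 2: group by name only, then test heterogeneity.
by_name = aggregate(products, "energy_kcal")

# Step 3: add a descriptor (here a hypothetical fat level) and re-test.
by_descriptor = aggregate(products, "energy_kcal", key=("name", "fat_level"))
```

In this toy data the pooled "yogurt" group exceeds the 20% cut-off, while splitting on the hypothetical `fat_level` descriptor yields two groups that fall below it, which mirrors the role the descriptor search plays in the third step.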

## Full-text entities

- **Chemicals:** saturated fatty acid (MESH:D005227), sugar (MESH:D000073893), sodium (MESH:D012964), carbohydrate (MESH:D002241)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12844821/full.md

---
Source: https://tomesphere.com/paper/PMC12844821