# Personalized Glucose Management With AI: Pilot Study Using a Multiarmed Bandit Approach

**Authors:** Shinji Hotta, Mikko Kytö, Saila Koivusalo, Seppo Heinonen, Pekka Marttinen

PMC · DOI: 10.2196/70826 · 2026-03-19

## TL;DR

This study introduces an AI method using a multiarmed bandit approach to personalize dietary and exercise recommendations for better glucose control in diabetes prevention.

## Contribution

A novel two-stage reward prediction model for online glucose management using a multiarmed bandit framework.

## Key findings

- The proposed algorithm significantly improved postprandial glucose levels compared to a randomized policy in simulations.
- In a small real-world experiment with 6 participants, personalized recommendations improved glucose responses by 23% on average compared to the randomized policy.
- Participants demonstrated behavioral adherence to carbohydrate intake and postprandial walking recommendations.

## Abstract

Personalized behavioral recommendations through mobile apps have proven effective in preventing serious chronic diseases such as diabetes. Recent studies have primarily focused on optimizing personalized recommendations using reinforcement learning. However, the main problem with these approaches is that they focus on behavioral changes and overlook clinical outcomes.

This study aimed to propose a method for online planning of dietary and exercise recommendations that directly optimizes postprandial glucose levels through behavioral changes.

The proposed method is a multiarmed bandit based on a two-stage reward prediction model, where an action is a combination of a total carbohydrate intake and a postprandial walking duration, and the reward is the reduction in postprandial glucose levels. The reward for each action is predicted in two stages: first the behavioral response to the recommended action, and then the postprandial glycemic response that follows from that predicted behavior.
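The two-stage idea can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the action grid (`CARB_TARGETS_G`, `WALK_TARGETS_MIN`), the epsilon-greedy exploration rule, the running-mean behavior model in stage 1, the linear glucose model trained by stochastic gradient descent in stage 2, and the toy environment `simulate_person`. The study's actual prediction models and bandit policy may differ.

```python
import itertools
import random

# Hypothetical action grid: (carbohydrate target in grams, walk duration in minutes).
CARB_TARGETS_G = (30, 50, 70)
WALK_TARGETS_MIN = (0, 15, 30)
ACTIONS = list(itertools.product(CARB_TARGETS_G, WALK_TARGETS_MIN))


class TwoStageBandit:
    """Epsilon-greedy bandit whose reward estimate chains two predictors."""

    def __init__(self, epsilon=0.1, lr=0.1, seed=0):
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)
        # Stage 1: expected behavior per action, initialized to full adherence
        # (the prior counts as one pseudo-observation).
        self.behavior = {a: [float(a[0]), float(a[1])] for a in ACTIONS}
        self.counts = {a: 1 for a in ACTIONS}
        # Stage 2: linear model weights for [bias, carbs, walk].
        self.w = [0.0, 0.0, 0.0]

    @staticmethod
    def _features(carbs_g, walk_min):
        # Roughly centered features keep the gradient steps well behaved.
        return (1.0, carbs_g / 100.0 - 0.5, walk_min / 30.0 - 0.5)

    def predict_reward(self, action):
        carbs, walk = self.behavior[action]           # stage 1: predicted behavior
        x = self._features(carbs, walk)
        return sum(wi * xi for wi, xi in zip(self.w, x))  # stage 2: predicted reward

    def select(self):
        if self.rng.random() < self.epsilon:          # explore
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=self.predict_reward)  # exploit

    def update(self, action, observed_carbs, observed_walk, reward):
        # Stage 1 update: running mean of what the person actually did.
        self.counts[action] += 1
        n = self.counts[action]
        b = self.behavior[action]
        b[0] += (observed_carbs - b[0]) / n
        b[1] += (observed_walk - b[1]) / n
        # Stage 2 update: one SGD step on squared error, using observed behavior.
        x = self._features(observed_carbs, observed_walk)
        err = reward - sum(wi * xi for wi, xi in zip(self.w, x))
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]


def simulate_person(action, rng):
    """Invented toy environment: partial adherence and a noisy glucose reward."""
    carbs = action[0] + rng.gauss(10, 5)                 # eats a bit more than advised
    walk = max(0.0, 0.7 * action[1] + rng.gauss(0, 3))   # walks about 70% of advised
    reward = 2.0 - 0.02 * carbs + 0.03 * walk + rng.gauss(0, 0.1)
    return carbs, walk, reward


def run_demo(rounds=300, seed=1):
    rng = random.Random(seed)
    bandit = TwoStageBandit(epsilon=0.1, seed=seed)
    for _ in range(rounds):
        action = bandit.select()
        carbs, walk, reward = simulate_person(action, rng)
        bandit.update(action, carbs, walk, reward)
    return bandit
```

Calling `run_demo()` returns a trained bandit; `bandit.select()` then yields the next recommendation. Separating the two stages means that an action is scored by what the person is expected to actually do, not by what was recommended, so a recommendation that is rarely followed is not credited with the effect of full adherence.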

In a simulation experiment, we demonstrated that the proposed online algorithm can significantly improve postprandial glucose levels with personalized recommendations compared to the randomized policy. Furthermore, we conducted a small real-world experiment with a simplified version of the proposed method, in which the recommendation policy was updated once to a personalized one. For 6 participants, compared to the randomized policy, we observed a 23% improvement, on average, in actual glucose responses, along with behavioral adherence to the recommendations concerning carbohydrate intake and postprandial walking.
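To make the notion of an average percentage improvement in glucose response concrete, the sketch below shows one plausible way such a figure could be computed. It assumes, without confirmation from this summary, that each glucose response is summarized as an incremental area under the curve (iAUC, a common summary of postprandial glucose) and that the improvement is the mean of per-participant relative reductions. The function names and the simplified clipping rule are illustrative choices, and no values from the study are used.

```python
def incremental_auc(times_min, glucose, baseline=None):
    """Trapezoidal area of glucose above baseline.

    Simplification: values below baseline are clipped to zero at the sample
    points, so baseline crossings between samples are only approximated.
    The baseline defaults to the first (premeal) sample.
    """
    if len(times_min) != len(glucose) or len(times_min) < 2:
        raise ValueError("need at least two matching time/glucose samples")
    if baseline is None:
        baseline = glucose[0]
    inc = [max(g - baseline, 0.0) for g in glucose]
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        area += dt * (inc[i - 1] + inc[i]) / 2.0
    return area


def mean_relative_improvement(reference, personalized):
    """Mean per-participant percentage reduction relative to the reference policy."""
    if len(reference) != len(personalized) or not reference:
        raise ValueError("need one reference and one personalized value per participant")
    pct = [100.0 * (r - p) / r for r, p in zip(reference, personalized)]
    return sum(pct) / len(pct)
```

For a hypothetical curve sampled every 30 minutes, `incremental_auc([0, 30, 60, 90, 120], [5.0, 7.0, 8.0, 6.0, 5.0])` gives 180.0 (in mmol/L·min if glucose is in mmol/L). Averaging per-participant percentages, as done here, can differ from the percentage change of the group mean, so the aggregation rule matters when comparing reported figures.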

The preliminary effectiveness of the proposed method was demonstrated in both the simulation experiment and the small real-world experiment. However, further longitudinal real-world experiments in patients with diabetes are needed to validate and generalize the findings.

## Linked entities

- **Diseases:** diabetes (MONDO:0005015)

## Full-text entities

- **Genes:** INS (insulin) [NCBI Gene 3630] {aka IDDM, IDDM1, IDDM2, ILPR, IRDN, MODY10}
- **Diseases:** chronic diseases (MESH:D002908), iAUC (MESH:D001927), hypertension (MESH:D006973), fatigue (MESH:D005221), prediabetes (MESH:D011236), diseases (MESH:D004194), diabetes (MESH:D003920)
- **Chemicals:** carbohydrate (MESH:D002241), Glucose (MESH:D005947)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13010317/full.md

---
Source: https://tomesphere.com/paper/PMC13010317