# A Hybrid Closed-Loop Blood Glucose Control Algorithm with a Safety Limiter Based on Deep Reinforcement Learning and Model Predictive Control

**Authors:** Shanyong Huang, Yusheng Fu, Shaowei Kong, Yuyang Liu, Jian Yan

PMC · DOI: 10.3390/bios16010047 · Biosensors · 2026-01-06

## TL;DR

This paper introduces a new blood glucose control system that combines deep reinforcement learning with model predictive control to safely manage diabetes treatment.

## Contribution

A hybrid control algorithm that integrates deep reinforcement learning and model predictive control for safer blood glucose regulation in diabetic patients.

## Key findings

- Under the proposed model, adult patients spend 72.51% of the time within the healthy blood glucose range (70–180 mg/dL), an increase of 2.54% over the baseline model.
- The algorithm avoids unsafe actions by using a safety controller based on model predictive control.
- The system outperforms the baseline model without increasing the proportion of severe hyperglycemia or hypoglycemia events.
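
The time-in-range figure reported above is a simple proportion, and a minimal sketch of how such a metric can be computed is shown here. The function name `time_in_range`, the inclusive bounds, and the sample readings are illustrative assumptions, not taken from the paper; only the 70–180 mg/dL range comes from the abstract.

```python
from typing import Iterable


def time_in_range(glucose_mg_dl: Iterable[float],
                  low: float = 70.0,
                  high: float = 180.0) -> float:
    """Return the percentage of readings with low <= value <= high.

    Assumes evenly spaced readings, so the share of readings equals
    the share of time. Inclusive bounds are an assumption of this sketch.
    """
    readings = list(glucose_mg_dl)
    if not readings:
        raise ValueError("at least one glucose reading is required")
    in_range = sum(1 for g in readings if low <= g <= high)
    return 100.0 * in_range / len(readings)


# Made-up readings for illustration only (mg/dL).
sample = [60, 70, 100, 180, 200]
tir = time_in_range(sample)
```

With unevenly spaced readings, each sample would need to be weighted by the interval it covers rather than counted once.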

## Abstract

The complexity of blood glucose dynamics and the high physiological variability among diabetic patients make it challenging to implement a safe and effective insulin dosage control algorithm that keeps blood glucose within the normal range (70–180 mg/dL). Deep reinforcement learning (DRL) has shown its potential in diabetes treatment in previous work, thanks to its strength in solving complex, dynamic, and uncertain problems. It can address limitations of traditional control algorithms, such as the need for patients to manually estimate carbohydrate intake before meals, the requirement to establish complex dynamic models, and the need for professional prior knowledge. However, reinforcement learning is essentially a highly exploratory trial-and-error learning strategy, which conflicts with the high safety requirements of clinical practice. Achieving safer control has therefore been a major challenge for the clinical application of DRL. This paper addresses the challenge by combining the advantages of DRL with those of a traditional control algorithm, model predictive control (MPC). Specifically, the blood glucose and insulin data generated while DRL interacts with patients during learning are used to learn a blood glucose prediction model, which solves the problem that MPC requires a dynamic model of the patient’s blood glucose. MPC is then used for forward-looking prediction and simulation of blood glucose, and a safety controller is introduced to avoid unsafe actions, restricting DRL control to a safer range. Experiments on the UVA/Padova glucose kinetics simulator, which is approved by the US Food and Drug Administration (FDA), show that under the proposed model adult patients spend 72.51% of the time within the healthy blood glucose range, an increase of 2.54% over the baseline model, while the proportion of severe hyperglycemia and hypoglycemia events does not increase. This is an important step towards the safe control of blood glucose.
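
The safety-limiter idea described in the abstract (roll a glucose prediction model forward, then restrict the DRL action if the predicted trajectory is unsafe) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the names `rollout`, `limit_dose`, and `toy_predict`, the linear stand-in predictor, the dose-halving rule, and the horizon are hypothetical and are not the authors' controller or learned model. Only the 70 mg/dL lower bound comes from the abstract.

```python
from typing import Callable, List

# Hypothetical one-step predictor: (current glucose, insulin dose) -> next glucose.
Predictor = Callable[[float, float], float]


def rollout(predict: Predictor, glucose: float, dose: float,
            horizon: int) -> List[float]:
    """Simulate `horizon` steps: the dose is given once, then no insulin."""
    trajectory = []
    g = glucose
    for step in range(horizon):
        g = predict(g, dose if step == 0 else 0.0)
        trajectory.append(g)
    return trajectory


def limit_dose(proposed: float, glucose: float, predict: Predictor,
               horizon: int = 6, hypo: float = 70.0,
               shrink: float = 0.5, max_tries: int = 10) -> float:
    """Shrink the proposed dose until the predicted minimum stays >= hypo.

    Falls back to a zero dose if no scaled-down dose is predicted safe.
    """
    dose = max(proposed, 0.0)
    for _ in range(max_tries):
        if min(rollout(predict, glucose, dose, horizon)) >= hypo:
            return dose
        dose *= shrink
    return 0.0


def toy_predict(glucose: float, dose: float) -> float:
    """Made-up linear stand-in for a learned model: 1 unit lowers glucose by 20 mg/dL."""
    return glucose - 20.0 * dose


safe = limit_dose(proposed=4.0, glucose=120.0, predict=toy_predict)
```

In this sketch the limiter only guards against predicted hypoglycemia and acts by scaling the dose down. A real controller would also need to handle hyperglycemia, prediction uncertainty, and insulin already on board.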

## Linked entities

- **Diseases:** diabetes (MONDO:0005015)

## Full-text entities

- **Genes:** INS (insulin) [NCBI Gene 3630] {aka IDDM, IDDM1, IDDM2, ILPR, IRDN, MODY10}
- **Diseases:** diabetes (MESH:D003920), hyperglycemia (MESH:D006943), hypoglycemia (MESH:D007003)
- **Chemicals:** carbohydrate (MESH:D002241), Blood Glucose (MESH:D001786), glucose (MESH:D005947)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12838616/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12838616/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/PMC12838616/full.md

---
Source: https://tomesphere.com/paper/PMC12838616