Safe Navigation for Robotic Digestive Endoscopy via Human Intervention-based Reinforcement Learning
Min Tan, Yushun Tao, Boyun Zheng, GaoSheng Xie, Lijuan Feng, Zeyang Xia, Jing Xiong

TL;DR
This paper introduces HI-PPO, a reinforcement learning framework that integrates human expertise to improve the safety and efficiency of robotic digestive endoscopy navigation in complex environments.
Contribution
The paper presents a novel HI-PPO framework combining human intervention with reinforcement learning to enhance safety in robotic endoscopy navigation.
Findings
HI-PPO achieves an average trajectory error of 8.02 mm.
HI-PPO attains a security score of 0.862.
Performance is comparable to human experts.
Abstract
With the increasing application of automated robotic digestive endoscopy (RDE), ensuring safe and efficient navigation in the unstructured and narrow digestive tract has become a critical challenge. Existing automated reinforcement learning navigation algorithms often cause potentially risky collisions because they lack essential human intervention, which significantly limits the safety and effectiveness of RDE in clinical practice. To address this limitation, we propose a Human Intervention (HI)-based Proximal Policy Optimization (PPO) framework, dubbed HI-PPO, which incorporates expert knowledge to enhance RDE's safety. Specifically, HI-PPO combines an Enhanced Exploration Mechanism (EEM), Reward-Penalty Adjustment (RPA), and Behavior Cloning Similarity (BCS) to address PPO's exploration inefficiencies for safe navigation in complex gastrointestinal environments.…
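For readers unfamiliar with the baseline that HI-PPO builds on, the following is a minimal sketch of the standard PPO clipped surrogate objective with an entropy bonus. It is not the authors' implementation: the EEM, RPA, and BCS components are not specified in this excerpt and are deliberately not modeled. The function names, `clip_eps`, and `entropy_coef` are illustrative assumptions.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_objective(new_logp, old_logp, advantages, action_probs,
                          clip_eps=0.2, entropy_coef=0.01):
    """Clipped surrogate plus an entropy bonus (hypothetical coefficient)."""
    mean_entropy = sum(entropy(p) for p in action_probs) / len(action_probs)
    surrogate = ppo_clip_objective(new_logp, old_logp, advantages, clip_eps)
    return surrogate + entropy_coef * mean_entropy
```

The clipping limits how far a single update can move the policy away from the one that collected the data, and the entropy term rewards keeping the action distribution spread out. How HI-PPO modifies or extends these terms would need to be taken from the full paper.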
Taxonomy
Topics: Augmented Reality Applications · Gastrointestinal Bleeding Diagnosis and Treatment
Methods: Entropy Regularization · Proximal Policy Optimization
