# Jacobian Exploratory Dual-Phase Reinforcement Learning for Dynamic Endoluminal Navigation of Deformable Continuum Robots

**Authors:** Yu Tian, Chi Kit Ng, Hongliang Ren

arXiv: 2509.00329 · 2025-09-03

## TL;DR

This paper introduces JEDP-RL (Jacobian Exploratory Dual-Phase RL), a reinforcement learning framework for planning and navigation of deformable continuum robots. It estimates the deformation Jacobian from small local exploratory actions and adds Jacobian features to the state, giving faster convergence and better generalization than PPO baselines in SOFA dynamic surgical simulations.

## Contribution

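The paper presents a dual-phase RL method for deformable robots with nonlinear mechanics: at each training step it first estimates the deformation Jacobian through small-scale exploratory actions, then executes the policy on a state augmented with Jacobian features.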

## Key findings

- 3.2x faster policy convergence than PPO baselines
- 25% fewer steps to reach the target in SOFA simulations
- 92% success rate under material property variations
- 83% success rate in an unseen tissue environment (33% higher than PPO)

## Abstract

Deformable continuum robots (DCRs) present unique planning challenges due to nonlinear deformation mechanics and partial state observability, which violate the Markov assumptions of conventional reinforcement learning (RL) methods. While Jacobian-based approaches offer theoretical foundations for rigid manipulators, their direct application to DCRs remains limited by time-varying kinematics and underactuated deformation dynamics. This paper proposes Jacobian Exploratory Dual-Phase RL (JEDP-RL), a framework that decomposes planning into phased Jacobian estimation and policy execution. During each training step, we first perform small-scale local exploratory actions to estimate the deformation Jacobian matrix, then augment the state representation with Jacobian features to restore approximate Markovianity. Extensive SOFA surgical dynamic simulations demonstrate three key advantages of JEDP-RL over proximal policy optimization (PPO) baselines: 1) convergence speed: 3.2x faster policy convergence; 2) navigation efficiency: 25% fewer steps to reach the target; and 3) generalization ability: a 92% success rate under material property variations and an 83% success rate (33% higher than PPO) in an unseen tissue environment.
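
The two phases described in the abstract can be illustrated with a minimal sketch. This summary does not give the paper's actual estimator, network inputs, or simulator interface, so everything below is an assumption made for illustration: the planar `toy_tip_position` model stands in for the SOFA simulation, the forward finite-difference scheme in `estimate_jacobian` is one plausible way to turn small exploratory actions into a Jacobian estimate, and `augment_observation` simply concatenates the flattened Jacobian onto the observation.

```python
import numpy as np


def toy_tip_position(q):
    """Hypothetical stand-in for the simulator: maps actuation to tip position.

    q = [bend_angle, insertion_length]. This is NOT the paper's robot model.
    """
    bend, insertion = q
    return np.array([insertion * np.sin(bend), insertion * np.cos(bend)])


def estimate_jacobian(tip_fn, q, eps=1e-4):
    """Phase 1 (assumed form): probe each actuation axis with a small step.

    Each column of J is the observed tip displacement divided by the step
    size, i.e. a forward finite-difference approximation of d(tip)/d(q).
    """
    q = np.asarray(q, dtype=float)
    base = tip_fn(q)
    J = np.zeros((base.size, q.size))
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps  # small exploratory action along axis i
        J[:, i] = (tip_fn(q + dq) - base) / eps
    return J


def augment_observation(obs, J):
    """Phase 2 input (assumed form): append flattened Jacobian features."""
    return np.concatenate([np.asarray(obs, dtype=float).ravel(), J.ravel()])


q = np.array([0.3, 1.0])
J = estimate_jacobian(toy_tip_position, q)
augmented = augment_observation(toy_tip_position(q), J)
print(augmented.shape)  # (6,): 2 tip coordinates + 4 Jacobian entries
```

In this sketch the policy would receive `augmented` instead of the raw observation. The step size `eps` trades off truncation error against sensitivity to sensor noise, which matters more in a real or simulated deformable system than in this noise-free toy model.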

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2509.00329/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/2509.00329/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/2509.00329/full.md

---
Source: https://tomesphere.com/paper/2509.00329