# The distinct roles of reinforcement learning between pre-procedure and intra-procedure planning for prostate biopsy

**Authors:** Iani J. M. B. Gayo, Shaheer U. Saeed, Ester Bonmati, Dean C. Barratt, Matthew J. Clarkson, Yipeng Hu

PMC · DOI: 10.1007/s11548-024-03084-4 · 2024-03-07

## TL;DR

This paper compares reinforcement learning (RL) with imitation learning (IL) for planning needle sampling in prostate biopsy, and finds that RL's advantage appears in intra-procedure planning, where the agent must adapt to patient motion and MR-to-ultrasound registration error.

## Contribution

The study develops and compares RL and IL planning agents under varying degrees of intra-procedure motion and registration error, and shows that RL outperforms IL, which is supervised by commonly adopted sampling policies, for intra-procedure planning.

## Key findings

- Reinforcement learning outperforms imitation learning in intra-procedure planning under motion and registration errors.
- The improvement in biopsy sampling performance from RL was observed for intra-procedure planning, but not in experiments with only pre-procedure planning.
- Results suggest that RL can provide intelligent action suggestions during procedures, reducing targeting errors.

## Abstract

Magnetic resonance (MR) imaging-targeted prostate cancer (PCa) biopsy enables precise sampling of MR-detected lesions, establishing its importance in recommended clinical practice. Planning for the ultrasound-guided procedure involves pre-selecting needle sampling positions. However, the procedure is affected by a number of factors, including MR-to-ultrasound registration, intra-procedure patient movement and soft tissue motion. When a fixed pre-procedure plan is carried out without intra-procedure adaptation, these factors lead to sampling errors which could cause false positives and false negatives. Reinforcement learning (RL) has been proposed for procedure planning in similar applications, because intelligent agents can be trained for both pre-procedure and intra-procedure planning. However, it is not clear whether RL is beneficial in addressing these intra-procedure errors.

In this work, we develop and compare imitation learning (IL), supervised by demonstrations of a predefined sampling strategy, and RL approaches, under varying degrees of intra-procedure motion and registration error, which represent sources of targeting error likely to occur in an intra-operative procedure.

Based on results using imaging data from 567 PCa patients, we demonstrate the efficacy and value in adopting RL algorithms to provide intelligent intra-procedure action suggestions, compared to IL-based planning supervised by commonly adopted policies.

This improvement in biopsy sampling performance for intra-procedure planning was not observed in experiments with only pre-procedure planning. These findings suggest a strong role for RL in future prospective studies which adopt intra-procedure planning. Our open-source code implementation is publicly available.
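
The abstract's point that a fixed pre-procedure plan accumulates sampling error can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' method or code: the lesion is modelled as a disc in a 2D plane, the combined registration and motion error as an isotropic Gaussian displacement, and the function name `hit_rate` and all parameter values are hypothetical.

```python
import math
import random


def hit_rate(error_sd_mm, lesion_radius_mm=5.0, n_trials=2000, seed=0):
    """Fraction of trials in which a needle aimed at the planned lesion
    centre still lands inside the displaced lesion.

    Assumptions (illustrative only): 2D geometry, circular lesion,
    zero-mean Gaussian displacement with the same SD on each axis.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_trials):
        # Displacement of the true lesion centre from the planned position,
        # standing in for registration error plus patient and tissue motion.
        dx = rng.gauss(0.0, error_sd_mm)
        dy = rng.gauss(0.0, error_sd_mm)
        # The fixed plan does not adapt, so it hits only if the
        # displacement stays within the lesion radius.
        if math.hypot(dx, dy) <= lesion_radius_mm:
            hits += 1
    return hits / n_trials


for sd in (0.0, 2.0, 5.0, 10.0):
    print(f"error sd = {sd:4.1f} mm -> fixed-plan hit rate = {hit_rate(sd):.2f}")
```

In this toy setting the hit rate of the non-adaptive plan falls as the error grows, which is the gap that intra-procedure planning is meant to close; the sketch says nothing about how an RL or IL agent would actually perform.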

## Linked entities

- **Diseases:** prostate cancer (MONDO:0005159)

## Full-text entities

- **Diseases:** PCa (MESH:D011471)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11178630/full.md

---
Source: https://tomesphere.com/paper/PMC11178630