# Stable distance regression via spatial–frequency state space model for robot-assisted endomicroscopy

**Authors:** Mengyi Zhou, Chi Xu, Stamatia Giannarou

PMC · DOI: 10.1007/s11548-025-03353-w · International Journal of Computer Assisted Radiology and Surgery · 2025-04-12

## TL;DR

This paper introduces SF-BiS4D, a state space model that regresses the probe–tissue distance from probe-based confocal laser endomicroscopy (pCLE) image sequences, improving the accuracy and stability of robot-assisted tissue scanning.

## Contribution

SF-BiS4D, a spatial–frequency bidirectional structured state space model, processes pCLE image sequences bidirectionally and in both the spatial and frequency domains to improve probe–tissue distance regression.

## Key findings

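- On the pCLE regression dataset (PRD), SF-BiS4D outperforms existing state-of-the-art methods in accuracy and stability for probe–tissue distance regression.
- A guided trajectory planning strategy generates pseudo-distance labels for training sequential models to produce smooth, stable robotic scanning trajectories.
- Hierarchical guided fine-tuning (GF) reduces the size of the BiS4D model while maintaining performance, with the aim of faster inference.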

## Abstract

Probe-based confocal laser endomicroscopy (pCLE) is a noninvasive technique that enables direct visualization of tissue at a microscopic level in real time. One of the main challenges in using pCLE is keeping the probe within a micrometer-scale working range. Precise robotic tissue scanning therefore requires automatic regression of the probe–tissue distance.

In this paper, we propose the spatial–frequency bidirectional structured state space model (SF-BiS4D) for pCLE probe–tissue distance regression. This model advances traditional state space models by processing image sequences bidirectionally and analyzing data in both the frequency and spatial domains. Additionally, we introduce a guided trajectory planning strategy that generates pseudo-distance labels, facilitating the training of sequential models to generate smooth and stable robotic scanning trajectories. To improve inference speed, we also implement a hierarchical guided fine-tuning (GF) approach that efficiently reduces the size of the BiS4D model while maintaining performance.
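The general idea of scanning a sequence in both directions over spatial and frequency features can be illustrated with a small NumPy sketch. This is not the authors' SF-BiS4D implementation: the function names (`ssm_scan`, `bidirectional_scan`, `frame_features`), the toy per-frame descriptors, the random frames, and the matrices `A`, `B`, `C` are all assumptions chosen for illustration, and a plain linear recurrence stands in for the structured state space layer.

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Discrete linear state space recurrence: x_k = A x_{k-1} + B u_k, y_k = C x_k."""
    x = np.zeros(A.shape[0])
    y = np.empty((u.shape[0], C.shape[0]))
    for k in range(u.shape[0]):
        x = A @ x + B @ u[k]
        y[k] = C @ x
    return y

def bidirectional_scan(u, A, B, C):
    """Run the recurrence forward and backward in time, then concatenate the outputs."""
    fwd = ssm_scan(u, A, B, C)
    bwd = ssm_scan(u[::-1], A, B, C)[::-1]  # scan the reversed sequence, re-align in time
    return np.concatenate([fwd, bwd], axis=1)

def frame_features(frame):
    """Toy descriptors (assumed, not from the paper): two spatial, one frequency."""
    spatial = np.array([frame.mean(), frame.std()])
    spectrum = np.fft.fftshift(np.abs(np.fft.fft2(frame)))
    h, w = frame.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    total = spectrum.sum()
    low = spectrum[radius <= min(h, w) / 8].sum()
    # Share of spectral magnitude outside the low-frequency disc.
    high_ratio = 0.0 if total == 0 else 1.0 - low / total
    return np.concatenate([spatial, [high_ratio]])

# Hypothetical input: 6 random 32x32 "frames" standing in for a pCLE sequence.
rng = np.random.default_rng(0)
frames = rng.random((6, 32, 32))
u = np.stack([frame_features(f) for f in frames])  # shape (6, 3)

n_state = 4
A = 0.9 * np.eye(n_state)                          # stable toy dynamics
B = 0.1 * rng.normal(size=(n_state, u.shape[1]))
C = rng.normal(size=(1, n_state))

out = bidirectional_scan(u, A, B, C)               # shape (6, 2): forward and backward outputs
```

In this sketch each time step ends up with one output that summarizes past frames and one that summarizes future frames; a regression head would then map those to a distance estimate. A trained model would learn `A`, `B`, `C` and the feature extractor rather than fixing them as done here.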

The performance of our proposed model has been evaluated both qualitatively and quantitatively using the pCLE regression dataset (PRD). In comparison with existing state-of-the-art (SOTA) methods, our approach demonstrated superior performance in terms of accuracy and stability.

Our proposed deep learning-based framework effectively improves distance regression for microscopic visual servoing and demonstrates its potential for integration into surgical procedures requiring precise real-time intraoperative imaging.

The online version contains supplementary material available at 10.1007/s11548-025-03353-w.

## Full-text entities

- **Diseases:** tumor (MESH:D009369), ACC (MESH:D004476)
- **Chemicals:** GTP (-)
- **Species:** Sus scrofa (pig, species) [taxon 9823], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12167353/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12167353/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12167353/full.md

---
Source: https://tomesphere.com/paper/PMC12167353