SutureFormer: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space

Huanrong Liu; Chunlin Tian; Tongyu Jia; Tailai Zhou; Qin Liu; Yu Gao; Yutong Ban; Yun Gu; Guy Rosman; Xin Ma; Qingbiao Li

arXiv:2603.26720 · cs.RO · May 19, 2026

TL;DR

SutureFormer is a goal-conditioned offline reinforcement learning framework that predicts surgical needle trajectories from endoscopic video by modeling needle motion as a sequence of pixel-wise steps. It reduces Average Displacement Error by 58.6% compared to baselines.

Contribution

The paper formulates needle trajectory prediction as a sequential decision-making problem and solves it with offline RL, using dense rewards derived from sparse annotations.
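One way to picture "dense rewards from sparse annotations" is sketched below. This is an illustrative assumption, not the paper's reward design: sparse waypoints are linearly interpolated into a per-step reference path, and each step is rewarded with the negative pixel distance to that reference. The names `densify_waypoints`, `dense_rewards`, and `steps_per_segment` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def densify_waypoints(waypoints: List[Point], steps_per_segment: int) -> List[Point]:
    """Linearly interpolate between consecutive sparse waypoints.

    Assumption for illustration: straight-line interpolation stands in for
    whatever densification the paper actually uses.
    """
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    if steps_per_segment < 1:
        raise ValueError("steps_per_segment must be >= 1")
    dense = [waypoints[0]]
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        for k in range(1, steps_per_segment + 1):
            a = k / steps_per_segment
            dense.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    return dense


def dense_rewards(trajectory: List[Point], reference: List[Point]) -> List[float]:
    """Per-step reward: negative Euclidean distance to the reference point."""
    if len(trajectory) != len(reference):
        raise ValueError("trajectory and reference must have equal length")
    return [
        -math.hypot(px - rx, py - ry)
        for (px, py), (rx, ry) in zip(trajectory, reference)
    ]


if __name__ == "__main__":
    sparse = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]  # toy annotations
    reference = densify_waypoints(sparse, steps_per_segment=2)
    print(reference)
    print(dense_rewards(reference, reference))
```

With a reward at every step rather than only at annotated waypoints, an offline RL learner gets a training signal for each transition in the logged trajectory.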

Findings

1. Reduces Average Displacement Error by 58.6% compared to baselines.

2. Effectively models continuous, physically plausible needle motion in pixel space.

3. Demonstrates superior performance on a new kidney wound suturing dataset.
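For readers unfamiliar with the metric, Average Displacement Error (ADE) is commonly defined as the mean Euclidean distance between predicted and ground-truth positions over all time steps. The sketch below shows that standard definition and how a relative reduction is computed. The function names and the toy coordinates are illustrative assumptions and do not reproduce the paper's data or its exact evaluation protocol.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def average_displacement_error(pred: List[Point], gt: List[Point]) -> float:
    """Mean Euclidean distance (in pixels) between matched trajectory points."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt))
    return total / len(pred)


def relative_reduction(baseline_ade: float, model_ade: float) -> float:
    """Fractional ADE reduction of a model relative to a baseline."""
    if baseline_ade <= 0:
        raise ValueError("baseline ADE must be positive")
    return (baseline_ade - model_ade) / baseline_ade


if __name__ == "__main__":
    # Toy values only, chosen to make the arithmetic easy to follow.
    gt = [(0.0, 0.0), (0.0, 0.0)]
    pred = [(0.0, 0.0), (3.0, 4.0)]
    print(average_displacement_error(pred, gt))
    print(relative_reduction(10.0, 4.0))
```

A reported reduction of 58.6% corresponds to `relative_reduction` returning 0.586, that is, the model's ADE is 41.4% of the baseline's.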

Abstract

Predicting surgical needle trajectories from endoscopic video is critical for robot-assisted suturing, enabling anticipatory planning, real-time guidance, and safer motion execution. Existing methods that directly learn motion distributions from visual observations tend to overlook the sequential dependency among adjacent motion steps. Moreover, sparse waypoint annotations often fail to provide sufficient supervision, further increasing the difficulty of supervised or imitation learning methods. To address these challenges, we formulate image-based needle trajectory prediction as a sequential decision-making problem, in which the needle tip is treated as an agent that moves step by step in pixel space. This formulation naturally captures the continuity of needle motion and enables the explicit modeling of physically plausible pixel-wise state transitions over time. From this…
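The sequential formulation in the abstract, with the needle tip as an agent that moves step by step in pixel space, can be sketched as a small goal-conditioned rollout. Everything below is a hypothetical illustration and not the SutureFormer architecture: the step bound `max_step`, the image size, and `greedy_policy` (a hand-written stand-in for a learned policy) are all assumptions.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int]
Policy = Callable[[Pixel, Pixel], Tuple[int, int]]


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(high, value))


def step(pos: Pixel, action: Tuple[int, int], max_step: int,
         width: int, height: int) -> Pixel:
    """Pixel-space transition: bounded displacement, kept inside the image."""
    dx = clamp(action[0], -max_step, max_step)
    dy = clamp(action[1], -max_step, max_step)
    return (clamp(pos[0] + dx, 0, width - 1), clamp(pos[1] + dy, 0, height - 1))


def greedy_policy(pos: Pixel, goal: Pixel) -> Tuple[int, int]:
    """Placeholder policy: head straight for the goal (not a learned model)."""
    return (goal[0] - pos[0], goal[1] - pos[1])


def rollout(start: Pixel, goal: Pixel, policy: Policy, horizon: int,
            max_step: int = 5, width: int = 640, height: int = 480) -> List[Pixel]:
    """Roll the agent forward until it reaches the goal or the horizon ends."""
    trajectory = [start]
    pos = start
    for _ in range(horizon):
        if pos == goal:
            break
        pos = step(pos, policy(pos, goal), max_step, width, height)
        trajectory.append(pos)
    return trajectory


if __name__ == "__main__":
    print(rollout((0, 0), (12, 3), greedy_policy, horizon=10))
```

Bounding each displacement is one simple way to enforce continuity between adjacent steps. In a learned setting, the policy would additionally condition on image features rather than only on the current and goal coordinates.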
