SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
Juo-Tung Chen, XinHao Chen, Ji Woong Kim, Paul Maria Scheikl, Richard Jaepyeong Cha, and Axel Krieger

TL;DR
SurgiPose uses differentiable rendering to estimate surgical tool kinematics from monocular video, enabling large-scale robot learning without ground-truth kinematics. Policies trained on the estimated kinematics achieve success rates comparable to policies trained on ground-truth kinematics.
Contribution
The paper presents a novel monocular pose estimation approach using differentiable rendering for surgical tools, facilitating autonomous surgical robot learning from online videos.
Findings
Policies trained on estimated kinematics achieve success rates similar to those of policies trained on ground-truth kinematics.
The method infers tool trajectories and joint angles from monocular videos.
The results demonstrate the feasibility of large-scale autonomous surgical policy learning.
Abstract
Imitation learning (IL) has shown immense promise in enabling autonomous dexterous manipulation, including learning surgical tasks. To fully unlock the potential of IL for surgery, access to clinical datasets is needed, but these datasets unfortunately lack the kinematic data required for current IL approaches. A promising source of large-scale surgical demonstrations is monocular surgical videos available online, making monocular pose estimation a crucial step toward enabling large-scale robot learning. Toward this end, we propose SurgiPose, a differentiable-rendering-based approach to estimate kinematic information from monocular surgical videos, eliminating the need for direct access to ground truth kinematics. Our method infers tool trajectories and joint angles by optimizing tool pose parameters to minimize the discrepancy between rendered and real images. To evaluate the effectiveness of our…
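The optimization idea in the abstract (adjust pose parameters until a rendered image matches the observed one) can be illustrated with a toy sketch. This is not the SurgiPose implementation: the "renderer" here is a hypothetical soft Gaussian blob on a 16x16 grid, the "pose" is only a 2D position, and the names `render`, `loss_and_grad`, and `fit_pose` are invented for illustration. The actual method presumably renders a full tool model and also optimizes orientation and joint angles.

```python
import math

SIZE = 16      # toy image is SIZE x SIZE pixels (assumption)
SIGMA = 2.0    # blob width; a soft edge keeps the render differentiable

def render(cx, cy):
    """Toy differentiable 'renderer': a Gaussian blob centred at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
             for x in range(SIZE)] for y in range(SIZE)]

def loss_and_grad(cx, cy, target):
    """Summed squared pixel error and its analytic gradient w.r.t. (cx, cy)."""
    loss = gx = gy = 0.0
    for y in range(SIZE):
        for x in range(SIZE):
            r = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
            d = r - target[y][x]
            loss += d * d
            # Chain rule: dr/dcx = r * (x - cx) / sigma^2 (same form for cy).
            gx += 2 * d * r * (x - cx) / SIGMA ** 2
            gy += 2 * d * r * (y - cy) / SIGMA ** 2
    return loss, gx, gy

def fit_pose(target, cx=6.0, cy=6.0, lr=0.2, steps=200):
    """Plain gradient descent on the pose so the render matches the target."""
    for _ in range(steps):
        _, gx, gy = loss_and_grad(cx, cy, target)
        cx -= lr * gx
        cy -= lr * gy
    return cx, cy

# Stand-in for a real video frame: a render at a known pose (9, 8).
target = render(9.0, 8.0)
est_x, est_y = fit_pose(target)
final_loss, _, _ = loss_and_grad(est_x, est_y, target)
```

Starting from an initial guess of (6, 6), the estimate moves toward the pose that generated the target image. In a real pipeline the analytic gradient would typically come from an automatic-differentiation framework paired with a differentiable renderer, and the target would be a segmented video frame rather than a synthetic render.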
Taxonomy
Topics: Surgical Simulation and Training · Soft Robotics and Applications · Robot Manipulation and Learning
