Learning to See Forces: Surgical Force Prediction with RGB-Point Cloud Temporal Convolutional Networks
Cong Gao, Xingtong Liu, Michael Peven, Mathias Unberath, Austin Reiter

TL;DR
This paper introduces a deep learning approach using RGB and point cloud data with temporal convolutional networks to predict surgical forces from visual cues, aiming to replace hardware-based force feedback in minimally invasive surgery.
Contribution
It presents a novel method combining RGB video and point cloud data with temporal convolutional networks for force prediction in surgical settings, validated on phantom and ex vivo tissue.
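To make the idea of a temporal convolution over fused visual features concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the per-frame feature values, the fusion by simple concatenation, the kernel weights, and the helper names `fuse_features` and `temporal_conv1d` are all assumptions chosen for illustration. A real temporal convolutional network would stack many such layers with learned weights and nonlinearities.

```python
def fuse_features(rgb_features, point_features):
    """Concatenate per-frame RGB and point cloud feature vectors.

    Concatenation is an assumed fusion scheme, used here only for illustration.
    """
    if len(rgb_features) != len(point_features):
        raise ValueError("both streams must cover the same number of frames")
    return [r + p for r, p in zip(rgb_features, point_features)]


def temporal_conv1d(frames, kernel):
    """Slide a kernel over the time axis (no padding) and return one value per window.

    frames: list of T feature vectors, each of length C.
    kernel: list of K weight vectors, each of length C.
    Returns T - K + 1 scalars, one per window of K consecutive frames.
    """
    k = len(kernel)
    outputs = []
    for t in range(len(frames) - k + 1):
        window = frames[t:t + k]
        outputs.append(
            sum(w * x for wv, xv in zip(kernel, window) for w, x in zip(wv, xv))
        )
    return outputs


# Hypothetical one-dimensional features for four frames (not data from the paper).
rgb = [[1.0], [2.0], [3.0], [4.0]]
points = [[0.0], [1.0], [0.0], [1.0]]
fused = fuse_features(rgb, points)

# Hypothetical kernel spanning two frames and both feature channels.
kernel = [[0.5, 0.0], [0.5, 1.0]]
force_like_signal = temporal_conv1d(fused, kernel)
```

Each output value depends on two consecutive frames, which is how a temporal model can pick up on how tissue deformation evolves rather than on a single image.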
Findings
Achieved a mean absolute error of 0.814 N in ex vivo tissue
Demonstrated feasibility of visual force inference as a low-cost alternative
Validated approach on phantom and biological tissue samples
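The mean absolute error reported above is the average magnitude of the difference between predicted and sensor-measured force. A minimal sketch of that computation follows; the function name and the sample values are made up for illustration and are not data from the paper.

```python
def mean_absolute_error(predicted, measured):
    """Average of |predicted - measured| over paired force samples, in newtons."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical force readings in newtons (illustrative only).
predicted_forces = [1.0, 2.5, 0.5, 3.0]
measured_forces = [1.5, 2.0, 1.5, 3.0]
mae = mean_absolute_error(predicted_forces, measured_forces)
```

Because the metric is in the same unit as the force itself, a value such as 0.814 N can be read directly as the typical size of the prediction error.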
Abstract
Robotic surgery has been proven to offer clear advantages during surgical procedures; however, one of the major limitations is obtaining haptic feedback. Since it is often challenging to devise a hardware solution with accurate force feedback, we propose the use of "visual cues" to infer forces from tissue deformation. Endoscopic video is a passive sensor that is freely available, in the sense that any minimally-invasive procedure already utilizes it. To this end, we employ deep learning to infer forces from video as an attractive low-cost and accurate alternative to typically complex and expensive hardware solutions. First, we demonstrate our approach in a phantom setting using the da Vinci Surgical System affixed with an OptoForce sensor. Second, we validate our method on an ex vivo liver organ. Our method results in a mean absolute error of 0.814 N in the ex vivo study,…
