arg-VU: Affordance Reasoning with Physics-Aware 3D Geometry for Visual Understanding in Robotic Surgery
Nan Xiao, Yunxin Fan, Farong Wang, Fei Liu

TL;DR
arg-VU introduces a physics-aware framework for surgical scene understanding that integrates geometry tracking and mechanical modeling to improve affordance reasoning in deformable tissues.
Contribution
The paper contributes a physics-aware affordance reasoning framework that couples temporally consistent geometry tracking, built on surfaces reconstructed with 3D Gaussian Splatting, with XPBD-based mechanical modeling for surgical visual understanding.
Findings
arg-VU produces more stable and more interpretable affordance predictions than the kinematic baselines it is compared against.
The framework demonstrates improved physical consistency over kinematic baselines.
Experiments show reliable affordance reasoning in deformable surgical environments.
Abstract
Affordance reasoning provides a principled link between perception and action, yet remains underexplored in surgical robotics, where tissues are highly deformable, compliant, and dynamically coupled with tool motion. We present arg-VU, a physics-aware affordance reasoning framework that integrates temporally consistent geometry tracking with constraint-induced mechanical modeling for surgical visual understanding. Surgical scenes are reconstructed using 3D Gaussian Splatting (3DGS) and converted into a temporally tracked surface representation. Extended Position-Based Dynamics (XPBD) embeds local deformation constraints and produces representative geometry points (RGPs) whose constraint sensitivities define anisotropic stiffness metrics capturing the local constraint-manifold geometry. Robotic tool poses in SE(3) are incorporated to compute rigidly induced displacements at RGPs, from…
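The two mechanical ingredients named in the abstract, an XPBD constraint update and the displacement a rigid tool motion induces at a point, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `xpbd_distance_step` and `rigid_displacement` are hypothetical, the choice of a two-particle distance constraint is an assumption (the abstract does not say which local deformation constraints are used), and the update follows the standard XPBD rule with time-step-scaled compliance.

```python
import numpy as np


def xpbd_distance_step(x1, x2, w1, w2, rest_length, compliance, dt, lam=0.0):
    """One XPBD update for a distance constraint C = |x1 - x2| - rest_length.

    Hypothetical helper for illustration only. w1 and w2 are inverse masses,
    compliance is the inverse stiffness, lam is the accumulated multiplier.
    """
    diff = x1 - x2
    dist = np.linalg.norm(diff)
    if dist < 1e-12 or (w1 + w2) == 0.0:
        return x1, x2, lam  # degenerate case: leave positions unchanged
    n = diff / dist                    # gradient of C with respect to x1
    c = dist - rest_length             # constraint violation
    alpha_tilde = compliance / dt**2   # time-step-scaled compliance
    dlam = (-c - alpha_tilde * lam) / (w1 + w2 + alpha_tilde)
    return x1 + w1 * dlam * n, x2 - w2 * dlam * n, lam + dlam


def rigid_displacement(R, t, p):
    """Displacement of point p under the rigid transform (R, t) in SE(3)."""
    return R @ p + t - p


if __name__ == "__main__":
    a = np.array([0.0, 0.0, 0.0])
    b = np.array([2.0, 0.0, 0.0])
    # Zero compliance behaves like a hard constraint for equal masses.
    a_new, b_new, lam = xpbd_distance_step(a, b, 1.0, 1.0, 1.0, 0.0, 0.01)
    print(a_new, b_new, lam)

    # 90-degree rotation about the z-axis with no translation.
    Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    print(rigid_displacement(Rz, np.zeros(3), np.array([1.0, 0.0, 0.0])))
```

A nonzero `compliance` makes the constraint soft, so a single step only partially restores the rest length. How arg-VU derives its anisotropic stiffness metrics from constraint sensitivities is not shown here, since the abstract above does not give enough detail to reproduce that step.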
