Agentic Surgical AI: Surgeon Style Fingerprinting and Privacy Risk Quantification via Discrete Diffusion in a Vision-Language-Action Framework
Huixin Zhan, Jason H. Moore

TL;DR
This paper introduces a novel agentic surgical AI model that predicts surgeon-specific behaviors using a discrete diffusion framework combined with a vision-language-action pipeline, highlighting the trade-off between personalization and privacy risk.
Contribution
It presents a new approach integrating discrete diffusion with multimodal inputs for surgeon-specific gesture prediction, and analyzes privacy risks associated with personalized embeddings.
Findings
The model accurately reconstructs gesture sequences for individual surgeons.
Personalized embeddings improve task performance.
More expressive embeddings increase privacy risk.
Abstract
Surgeons exhibit distinct operating styles shaped by training, experience, and motor behavior, yet most surgical AI systems overlook this personalization signal. We propose a novel agentic modeling approach for surgeon-specific behavior prediction in robotic surgery, combining a discrete diffusion framework with a vision-language-action (VLA) pipeline. Gesture prediction is framed as a structured sequence denoising task, conditioned on multimodal inputs including surgical video, intent language, and personalized embeddings of surgeon identity and skill. These embeddings are encoded through natural language prompts using third-party language models, allowing the model to retain individual behavioral style without exposing explicit identity. We evaluate our method on the JIGSAWS dataset and demonstrate that it accurately reconstructs gesture sequences while learning meaningful motion…
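To make "gesture prediction as sequence denoising" concrete, here is a minimal sketch of one common discrete diffusion variant: absorbing-state (mask) corruption with a linear schedule. It is an illustration only, not the authors' implementation. The excerpt above does not specify the noise process, so the MASK token, the schedule, the gesture labels, and the `predict` callable are all assumptions. `predict` stands in for a learned denoiser that, in the paper's setting, would also be conditioned on video, intent language, and the surgeon embedding.

```python
import random

MASK = "[MASK]"  # hypothetical absorbing token, not taken from the paper


def corrupt(sequence, t, num_steps, rng):
    """Forward process: mask each token independently with probability t / num_steps."""
    p = t / num_steps
    return [MASK if rng.random() < p else tok for tok in sequence]


def reverse_step(noisy, predictions, t, rng):
    """Reverse process, step t -> t-1: reveal each masked slot with probability 1 / t.

    Under the linear schedule above, a token masked at step t is unmasked at
    step t-1 with probability 1 / t, so every slot is revealed by t = 1.
    """
    reveal_p = 1.0 / t
    return [
        pred if tok == MASK and rng.random() < reveal_p else tok
        for tok, pred in zip(noisy, predictions)
    ]


def sample(length, predict, num_steps, rng):
    """Start from an all-mask sequence and denoise it over num_steps steps."""
    seq = [MASK] * length
    for t in range(num_steps, 0, -1):
        seq = reverse_step(seq, predict(seq), t, rng)
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative gesture sequence; labels are written in the style of gesture IDs.
    clean = ["G1", "G2", "G3", "G6", "G4", "G2", "G3", "G6"]
    print(corrupt(clean, t=5, num_steps=10, rng=rng))

    # Oracle denoiser used only for demonstration: it always returns the clean
    # sequence. A real model would predict tokens from the noisy input and context.
    print(sample(len(clean), lambda seq: clean, num_steps=10, rng=rng))
```

With the oracle denoiser the sampler recovers the clean sequence exactly. A learned denoiser would instead be trained to predict the masked gestures from the partially corrupted sequence and its conditioning inputs.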
Taxonomy
Topics: Digital Imaging in Medicine · Face recognition and analysis
Methods: Diffusion
