Generating Natural-Language Surgical Feedback: From Structured Representation to Domain-Grounded Evaluation

Firdavs Nasriddinov; Rafal Kocielnik; Anima Anandkumar; Andrew J. Hung

arXiv:2511.15159 · cs.CV · November 20, 2025

TL;DR

This paper introduces a structure-aware pipeline that learns a surgical action ontology from real trainer-to-trainee feedback transcripts and uses it to condition GPT-4o, generating clinically grounded, trainer-style surgical feedback with improved fidelity and verifiability.

Contribution

The paper presents a method for mining Instrument-Action-Target (IAT) triplets from real feedback text, normalizing their surface forms into categories, and conditioning feedback generation on these structured representations.

Findings

01 Improved video-to-IAT recognition accuracy with context and temporal tracking.

02 Enhanced feedback fidelity, doubling admissible feedback from 21% to 42%.

03 Significant reduction in word error rate and increase in ROUGE scores.

Abstract

High-quality intraoperative feedback from a surgical trainer is pivotal for improving trainee performance and long-term skill acquisition. Automating natural, trainer-style feedback promises timely, accessible, and consistent guidance at scale but requires models that understand clinically relevant representations. We present a structure-aware pipeline that learns a surgical action ontology from real trainer-to-trainee transcripts (33 surgeries) and uses it to condition feedback generation. We contribute by (1) mining Instrument-Action-Target (IAT) triplets from real-world feedback text and clustering surface forms into normalized categories, (2) fine-tuning a video-to-IAT model that leverages the surgical procedure and task contexts as well as fine-grained temporal instrument motion, and (3) demonstrating how to effectively use IAT triplet representations to guide GPT-4o in generating…
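
The idea of normalizing IAT surface forms and using the result to condition a text generator can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the lookup tables, category names, the `IATTriplet` class, and the prompt wording are hypothetical and do not come from the paper, which learns its categories by clustering real transcripts rather than using hand-written tables. No model is called; the sketch only builds the prompt text.

```python
from dataclasses import dataclass

# Hypothetical surface-form -> category tables (invented for illustration;
# the paper derives its categories by clustering real feedback transcripts).
INSTRUMENT_MAP = {"needle driver": "needle_driver", "driver": "needle_driver"}
ACTION_MAP = {"grab": "grasp", "grasp": "grasp", "pull": "retract"}
TARGET_MAP = {"tissue": "tissue", "the tissue": "tissue", "needle": "needle"}


@dataclass(frozen=True)
class IATTriplet:
    """A normalized Instrument-Action-Target triplet."""
    instrument: str
    action: str
    target: str


def _lookup(table: dict, surface: str) -> str:
    # Lowercase and collapse whitespace before the lookup;
    # unseen surface forms fall back to "unknown".
    key = " ".join(surface.lower().split())
    return table.get(key, "unknown")


def normalize(instrument: str, action: str, target: str) -> IATTriplet:
    """Map raw surface forms to normalized categories."""
    return IATTriplet(
        _lookup(INSTRUMENT_MAP, instrument),
        _lookup(ACTION_MAP, action),
        _lookup(TARGET_MAP, target),
    )


def build_prompt(triplet: IATTriplet, procedure: str, task: str) -> str:
    """Assemble a text prompt conditioned on the triplet (hypothetical wording)."""
    return (
        f"Procedure: {procedure}\n"
        f"Task: {task}\n"
        f"Instrument: {triplet.instrument}\n"
        f"Action: {triplet.action}\n"
        f"Target: {triplet.target}\n"
        "Write one sentence of trainer-style feedback for the trainee."
    )


if __name__ == "__main__":
    t = normalize("Needle  Driver", "Grab", "the tissue")
    # "example procedure" and "example task" are placeholder context strings.
    print(build_prompt(t, "example procedure", "example task"))
```

Keeping the triplet as an explicit, typed value is what makes the conditioning checkable: a generated sentence can be compared against the instrument, action, and target it was supposed to mention.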

Taxonomy

Topics: Surgical Simulation and Training · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education