LLM-based ambiguity detection in natural language instructions for collaborative surgical robots
Ana Davila, Jacinto Colan, Yasuhisa Hasegawa

TL;DR
This paper introduces a framework that uses an ensemble of Large Language Model (LLM) evaluators to detect ambiguity in natural language instructions for collaborative surgical robots, with the goal of enhancing safety and reliability in human-robot interaction.
Contribution
It presents a novel ensemble LLM-based method with multiple prompting techniques and conformal prediction for ambiguity detection in surgical instructions.
Findings
Achieved classification accuracy above 60% with Llama 3.2 11B and Gemma 3 12B when identifying ambiguous instructions.
Demonstrated improved safety in human-robot surgical collaboration.
Utilized multiple LLMs and conformal prediction for robust ambiguity detection.
Abstract
Ambiguity in natural language instructions poses significant risks in safety-critical human-robot interaction, particularly in domains such as surgery. To address this, we propose a framework that uses Large Language Models (LLMs) for ambiguity detection specifically designed for collaborative surgical scenarios. Our method employs an ensemble of LLM evaluators, each configured with distinct prompting techniques to identify linguistic, contextual, procedural, and critical ambiguities. A chain-of-thought evaluator is included to systematically analyze instruction structure for potential issues. Individual evaluator assessments are synthesized through conformal prediction, which yields non-conformity scores based on comparison to a labeled calibration dataset. Evaluating Llama 3.2 11B and Gemma 3 12B, we observed classification accuracy exceeding 60% in differentiating ambiguous from…
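The abstract's combination of an evaluator ensemble with conformal prediction can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each of five hypothetical evaluators returns an ambiguity score in [0, 1], that the ensemble output is their plain average, and that the non-conformity score is one minus the ensemble's support for a candidate label. The calibration rows, the function names, and the `epsilon` threshold are invented for illustration only.

```python
from statistics import mean

LABELS = ("ambiguous", "clear")

# Toy calibration set (invented): per-evaluator ambiguity scores plus the true label.
CALIBRATION = [
    ([0.9, 0.8, 0.7, 0.9, 0.8], "ambiguous"),
    ([0.7, 0.6, 0.8, 0.7, 0.7], "ambiguous"),
    ([0.8, 0.7, 0.8, 0.7, 0.75], "ambiguous"),
    ([0.6, 0.5, 0.6, 0.5, 0.55], "ambiguous"),
    ([0.3, 0.4, 0.3, 0.4, 0.35], "ambiguous"),
    ([0.2, 0.1, 0.3, 0.2, 0.2], "clear"),
    ([0.1, 0.2, 0.1, 0.1, 0.0], "clear"),
    ([0.4, 0.3, 0.4, 0.3, 0.35], "clear"),
    ([0.5, 0.6, 0.5, 0.6, 0.55], "clear"),
]


def nonconformity(evaluator_scores, label):
    """One minus the ensemble's support for `label` (assumed scoring rule)."""
    p_ambiguous = mean(evaluator_scores)  # assumed aggregation: simple average
    p_label = p_ambiguous if label == "ambiguous" else 1.0 - p_ambiguous
    return 1.0 - p_label


def calibrate(calibration_set):
    """Non-conformity of every calibration example with respect to its true label."""
    return [nonconformity(scores, label) for scores, label in calibration_set]


def p_value(calibration_scores, test_score):
    """Share of calibration scores at least as large as the test score (+1 smoothing)."""
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, evaluator_scores, epsilon=0.2):
    """Labels whose conformal p-value exceeds the significance level epsilon."""
    return {
        label
        for label in LABELS
        if p_value(calibration_scores, nonconformity(evaluator_scores, label)) > epsilon
    }


calibration_scores = calibrate(CALIBRATION)
confident = prediction_set(calibration_scores, [0.9, 0.9, 0.85, 0.85, 0.9])
uncertain = prediction_set(calibration_scores, [0.5, 0.6, 0.4, 0.5, 0.5])
print(confident, uncertain)
```

With this toy data, the first call returns a single label and the second returns both labels. In a collaborative setting, a two-label set could plausibly be treated as a cue to ask the surgeon for clarification, though that decision rule is an assumption of this sketch and not something stated in the abstract.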
Taxonomy
Topics: Robotics and Automated Systems
