Prompt-and-Check: Using Large Language Models to Evaluate Communication Protocol Compliance in Simulation-Based Training

Vishakha Lall; Yisi Liu

arXiv:2508.08652·cs.AI·August 13, 2025

Open Access

TL;DR

This paper introduces Prompt-and-Check, a lightweight method that uses open-source large language models to evaluate communication protocol compliance in simulation-based training, demonstrating effective context-aware reasoning without task-specific training.

Contribution

The paper presents a novel prompt-based approach for assessing protocol compliance using open-source LLMs, enabling efficient evaluation solely from transcribed verbal exchanges.

Findings

01. Models achieved high agreement with expert annotations.

02. Prompting enabled effective context-aware reasoning.

03. Method ran efficiently on consumer-grade GPUs.

Abstract

Accurate evaluation of procedural communication compliance is essential in simulation-based training, particularly in safety-critical domains where adherence to compliance checklists reflects operational competence. This paper explores a lightweight, deployable approach using prompt-based inference with open-source large language models (LLMs) that can run efficiently on consumer-grade GPUs. We present Prompt-and-Check, a method that uses context-rich prompts to evaluate whether each checklist item in a protocol has been fulfilled, based solely on transcribed verbal exchanges. We perform a case study in the maritime domain with participants performing an identical simulation task, and experiment with models such as LLaMA 2 7B, LLaMA 3 8B and Mistral 7B, running locally on an RTX 4070 GPU. For each checklist item, a prompt incorporating relevant transcript excerpts is fed into the model,…
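The per-item loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the YES/NO answer format, the function names (`build_prompt`, `parse_verdict`, `check_protocol`), and the example checklist items and transcript lines are all hypothetical. The `generate` argument stands in for a call to a locally hosted model; here it is replaced by a keyword-matching stub so the sketch runs without a GPU.

```python
from typing import Callable, Dict, List

# Hypothetical prompt wording; the paper's actual prompts are not given in this excerpt.
PROMPT_TEMPLATE = (
    "You are evaluating communication protocol compliance.\n"
    "Checklist item: {item}\n"
    "Transcript excerpts:\n{excerpts}\n"
    "Was this checklist item fulfilled? Answer YES or NO."
)


def build_prompt(item: str, excerpts: List[str]) -> str:
    """Combine one checklist item with its transcript excerpts into a prompt."""
    joined = "\n".join(f"- {line}" for line in excerpts)
    return PROMPT_TEMPLATE.format(item=item, excerpts=joined)


def parse_verdict(reply: str) -> bool:
    """Count the item as fulfilled only if the reply's first word is YES."""
    words = reply.strip().split()
    return bool(words) and words[0].strip(".,:;!").upper() == "YES"


def check_protocol(
    checklist: List[str],
    excerpts: List[str],
    generate: Callable[[str], str],
) -> Dict[str, bool]:
    """Issue one prompt per checklist item and collect the verdicts."""
    return {
        item: parse_verdict(generate(build_prompt(item, excerpts)))
        for item in checklist
    }


def stub_generate(prompt: str) -> str:
    """Stand-in for a local LLM call (assumption): NOT a language model.

    It answers YES when the last word of the checklist item appears
    anywhere in the text that follows the transcript header.
    """
    item_line = prompt.split("\n")[1]
    transcript = prompt.split("Transcript excerpts:\n")[1]
    keyword = item_line.split()[-1].lower()
    return "YES." if keyword in transcript.lower() else "NO."


if __name__ == "__main__":
    # Made-up example data for illustration only.
    checklist = [
        "Pilot states the vessel speed",
        "Pilot confirms the intended berth",
    ]
    excerpts = ["Our current speed is 12 knots.", "Engines are on standby."]
    for item, fulfilled in check_protocol(checklist, excerpts, stub_generate).items():
        print(f"{item}: {'fulfilled' if fulfilled else 'not fulfilled'}")
```

In practice, `stub_generate` would be swapped for a function that sends the prompt to whichever local model is in use and returns its text reply; passing the model call in as a parameter keeps the prompt construction and verdict parsing independent of any particular inference library.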

Taxonomy

Topics: Human-Automation Interaction and Safety · Maritime Navigation and Safety · Speech and dialogue systems