Comparing Two Model Designs for Clinical Note Generation; Is an LLM a Useful Evaluator of Consistency?
Nathan Brake, Thomas Schaaf

TL;DR
This study compares two approaches for generating clinical SOAP notes from audio recordings of patient conversations, evaluates note consistency using both automatic metrics and an LLM standing in for human raters, and finds that conditioned generation improves note consistency.
Contribution
It introduces and compares two SOAP note generation methods and shows that an LLM such as Llama2 can evaluate note consistency, which allows quality assessment to scale beyond manual review.
Findings
Both methods yield similar ROUGE and factuality scores.
An LLM can rate note consistency with agreement comparable to that of human raters.
Conditioned generation improves overall note consistency.
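To make the agreement finding concrete, here is a minimal sketch of how agreement between a human rater and an LLM rater could be quantified. The summary does not say which agreement statistic the authors use; Cohen's kappa is assumed here purely for illustration, and the function name, labels, and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: kappa is an illustrative choice of statistic,
    not necessarily the one used in the paper.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")

    n = len(rater_a)

    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Both raters used a single identical label: treat as perfect agreement.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical consistency ratings for four generated notes.
human = ["consistent", "inconsistent", "consistent", "consistent"]
llm = ["consistent", "inconsistent", "inconsistent", "consistent"]

kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 indicates agreement well above chance, and a value near 0 indicates agreement no better than chance. Comparing the human-versus-LLM value with the human-versus-human value is one way to judge whether the LLM behaves like another human rater.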
Abstract
Following an interaction with a patient, physicians are responsible for the submission of clinical documentation, often organized as a SOAP note. A clinical note is not simply a summary of the conversation but requires the use of appropriate medical terminology. The relevant information can then be extracted and organized according to the structure of the SOAP note. In this paper we analyze two different approaches to generate the different sections of a SOAP note based on the audio recording of the conversation, and specifically examine them in terms of note consistency. The first approach generates the sections independently, while the second method generates them all together. In this work we make use of PEGASUS-X Transformer models and observe that both methods lead to similar ROUGE values (less than 1% difference) and have no difference in terms of the Factuality metric. We perform…
Taxonomy
Topics: Health Sciences Research and Education · Clinical practice guidelines implementation · Nursing Diagnosis and Documentation
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Residual Connection · Dropout · Multi-Head Attention · Adam · Softmax · Layer Normalization
