Chain-of-Thought (CoT) prompting strategies for medical error detection and correction
Zhaolong Wu, Abul Hasan, Jinge Wu, Yunsoo Kim, Jason P.Y. Cheung, Teng Zhang, Honghan Wu

TL;DR
This paper explores the use of Chain-of-Thought prompting with large language models to detect and correct medical errors in clinical notes, achieving competitive rankings in a shared task.
Contribution
It introduces three CoT-based methods for medical error detection and correction: manual prompt design, data-driven reasoning, and a rule-based ensemble of the two.
Findings
The ensemble method ranked 3rd in both error detection and span identification.
It placed 7th in error correction among all submissions.
The results demonstrate the effectiveness of CoT prompting for clinical NLP tasks.
Abstract
This paper describes our submission to the MEDIQA-CORR 2024 shared task for automatically detecting and correcting medical errors in clinical notes. We report results for three methods of few-shot In-Context Learning (ICL) augmented with Chain-of-Thought (CoT) and reason prompts using a large language model (LLM). In the first method, we manually analyse a subset of the training and validation datasets to infer three CoT prompts by examining error types in the clinical notes. In the second method, we utilise the training dataset to prompt the LLM to deduce reasons for the correctness or incorrectness of the clinical notes. The constructed CoTs and reasons are then combined with ICL examples to solve the tasks of error detection, span identification, and error correction. Finally, we combine the two methods using a rule-based ensemble method. Across the three sub-tasks, our ensemble method achieves a ranking of…
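The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Example`, `Prediction`, `build_prompt`, and `ensemble` are hypothetical, the prompt layout is an assumption, and the ensemble rule shown (prefer the manual-CoT prediction when it flags an error, otherwise fall back to the data-driven one) is an invented placeholder, since the abstract does not state the actual rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One few-shot ICL example (hypothetical structure)."""
    note: str        # clinical note text
    reasoning: str   # CoT or LLM-generated reason attached to the example
    has_error: bool  # gold label: does the note contain an error?


@dataclass
class Prediction:
    """Output for the three sub-tasks (hypothetical structure)."""
    has_error: bool        # sub-task 1: error detection
    sentence_id: int = -1  # sub-task 2: error span, -1 when no error
    correction: str = ""   # sub-task 3: corrected sentence


def build_prompt(cot_hint: str, examples: List[Example], note: str) -> str:
    """Assemble a few-shot prompt: CoT hint, worked examples, then the query."""
    parts = [cot_hint.strip(), ""]
    for i, ex in enumerate(examples, start=1):
        label = "incorrect" if ex.has_error else "correct"
        parts.append(f"Example {i}")
        parts.append(f"Note: {ex.note}")
        parts.append(f"Reasoning: {ex.reasoning}")
        parts.append(f"Answer: {label}")
        parts.append("")
    # The query note ends with an open "Reasoning:" slot for the LLM to fill.
    parts.append(f"Note: {note}")
    parts.append("Reasoning:")
    return "\n".join(parts)


def ensemble(manual: Prediction, data_driven: Prediction) -> Prediction:
    """Placeholder rule (an assumption, not the paper's): trust the manual-CoT
    prediction when it flags an error, else use the data-driven prediction."""
    return manual if manual.has_error else data_driven


if __name__ == "__main__":
    shots = [Example("Toy note A.", "Toy reasoning A.", has_error=True)]
    print(build_prompt("Check the note step by step.", shots, "Toy note B."))
    print(ensemble(Prediction(False), Prediction(True, 3, "Toy fix.")))
```

In this sketch the two prompting methods differ only in where the `reasoning` text comes from (hand-written CoT versus LLM-generated reasons), which is why a single `build_prompt` helper serves both.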
Taxonomy
Topics: Quality and Safety in Healthcare · Risk and Safety Analysis
