Evaluating ChatGPT on Medical Information Extraction Tasks: Performance, Explainability and Beyond
Liz Li, Wei Zhu

TL;DR
This study evaluates ChatGPT's performance, explainability, confidence, and faithfulness in medical information extraction tasks, revealing strengths in explainability but limitations in accuracy and over-confidence compared to specialized models.
Contribution
It provides a comprehensive analysis of ChatGPT's capabilities and limitations in medical information extraction, highlighting areas for improvement and potential challenges in clinical applications.
Findings
ChatGPT underperforms fine-tuned models on medical information extraction (MedIE) tasks.
ChatGPT offers high-quality explanations but is often over-confident.
Uncertainty in generation affects extraction reliability.
Abstract
Large Language Models (LLMs) like ChatGPT have demonstrated remarkable capabilities in comprehending user intents and generating reasonable and useful responses. Besides their ability to chat, their capabilities in various natural language processing (NLP) tasks are of interest to the research community. In this paper, we focus on assessing the overall ability of ChatGPT in 4 different medical information extraction (MedIE) tasks across 6 benchmark datasets. We present a systematic analysis by measuring ChatGPT's performance, explainability, confidence, faithfulness, and uncertainty. Our experiments reveal that: (a) ChatGPT's performance scores on MedIE tasks fall behind those of the fine-tuned baseline models. (b) ChatGPT can provide high-quality explanations for its decisions; however, it is over-confident in its predictions. (c) ChatGPT demonstrates a high level of faithfulness…
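To make the over-confidence finding concrete, the following is a minimal sketch of two common ways to quantify the gap between a model's stated confidence and its actual accuracy. It is an illustration only, not the paper's evaluation code: the function names `overconfidence_gap` and `expected_calibration_error`, the 10-bin default, and the toy inputs are all assumptions introduced here, and the abstract does not state which calibration metric the authors used.

```python
from typing import List


def overconfidence_gap(confidences: List[float], correct: List[bool]) -> float:
    """Mean stated confidence minus accuracy; a positive value means over-confidence.

    Hypothetical helper for illustration, not taken from the paper.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_confidence - accuracy


def expected_calibration_error(
    confidences: List[float], correct: List[bool], n_bins: int = 10
) -> float:
    """Expected calibration error with equal-width confidence bins (assumed metric)."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, conf in enumerate(confidences):
        # A confidence of exactly 1.0 falls into the last bin.
        bins[min(int(conf * n_bins), n_bins - 1)].append(i)
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        bin_confidence = sum(confidences[i] for i in members) / len(members)
        bin_accuracy = sum(1 for i in members if correct[i]) / len(members)
        # Weight each bin's |confidence - accuracy| gap by its share of predictions.
        ece += (len(members) / total) * abs(bin_confidence - bin_accuracy)
    return ece


# Toy data (made up): the model reports 0.9 confidence but is right half the time.
toy_confidences = [0.9, 0.9, 0.9, 0.9]
toy_correct = [True, False, True, False]
gap = overconfidence_gap(toy_confidences, toy_correct)
ece = expected_calibration_error(toy_confidences, toy_correct)
```

In this toy case both measures come out near 0.4, because the model claims 90% confidence while being correct 50% of the time. A well-calibrated model would score close to zero on both.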
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Explainable Artificial Intelligence (XAI)
