Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian, Elahe Khatibi, Iman Azimi, David Oniani, Zahra Shakeri Hossein Abad, Alexander Thieme, Ram Sriram, Zhongqi Yang, Yanshan Wang, Bryant Lin, Olivier Gevaert, Li-Jia Li, Ramesh Jain, Amir M. Rahmani

TL;DR
This paper reviews and proposes comprehensive evaluation metrics tailored for healthcare chatbots powered by generative AI, emphasizing medical accuracy, user trust, empathy, and real-world clinical impact.
Contribution
It introduces a new set of evaluation metrics specifically designed for healthcare conversational AI, addressing gaps in existing generic LLM assessment methods.
Findings
Existing metrics lack medical and emotional context understanding.
Proposed metrics evaluate language, clinical impact, and user interaction.
The paper also discusses the challenges of implementing healthcare-specific evaluation metrics.
Abstract
Generative Artificial Intelligence is set to revolutionize healthcare delivery by transforming traditional patient care into a more personalized, efficient, and proactive process. Chatbots, serving as interactive conversational models, will probably drive this patient-centered transformation in healthcare. Through the provision of various services, including diagnosis, personalized lifestyle recommendations, and mental health support, the objective is to substantially augment patient health outcomes, all the while mitigating the workload burden on healthcare providers. The life-critical nature of healthcare applications necessitates establishing a unified and comprehensive set of evaluation metrics for conversational models. Existing evaluation metrics proposed for various generic large language models (LLMs) demonstrate a lack of comprehension regarding medical and health concepts and…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Digital Mental Health Interventions
